Search for two words in a certain order, but not next to each other

Eric

I hope I haven't asked this before :)

In the TR module I want to search for the Greek words και and δε in that order, but not when they are contiguous (which they never are). I know I can search "for verses with ALL of the words specified", but that returns 1431 verses, most of them false positives for what I'm looking for.

How can I search for two words, one following the other, but not necessarily immediately after it?

Thank you!
 
The only way to do this is with a RegEx search. Unfortunately, PCRE RegEx doesn't handle character classes in Unicode very well, so it gets complicated. After some experimentation, I think this RegEx search should work for you:

[ ^]και .* δε[ $]

219 verses.
 
(1) It seems that the RegEx search [ ^]και .* δε[ $] is not catching the verses where και is the first word, e.g., Joh 8:16 and 17 both have that construction but are not found. Is that fixable?
(2) Is there any way to narrow it down to, say, και before δε, with the 2nd word not more than x words from the 1st?
 
(1) It seems that the RegEx search [ ^]και .* δε[ $] is not catching the verses where και is the first word, e.g., Joh 8:16 and 17 both have that construction but are not found. Is that fixable?

You could remove the beginning character class, the [ ^] part, but you will have more verses to sift through. The version of PCRE being used internally doesn't handle the word boundaries very well in the Unicode range, sorry.
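
A side note on why the verse-initial cases are likely missed: in standard regex syntax, a ^ inside square brackets that is not the first character is a literal caret, and $ inside brackets is a literal dollar sign, so [ ^] means "a space or a caret", not "a space or the start of the verse". An alternation such as ( |^) keeps ^ as an anchor. The sketch below shows the difference using Python's re module as a stand-in; the sample lines are invented, and SwordSearcher's own engine may behave differently in other respects.

```python
import re

# Invented sample lines for illustration (not actual TR verse text).
samples = [
    "και εγω δε ειπον",          # και is the first word
    "ειπεν και αυτος δε λεγει",  # και has a space before it
]

# Inside [...] a ^ that is not the first character, and any $, are literal.
char_class = re.compile(r"[ ^]και .* δε[ $]")
# In an alternation, ^ and $ keep their meaning as start/end anchors.
alternation = re.compile(r"( |^)και .* δε( |$)")

for line in samples:
    print(bool(char_class.search(line)), bool(alternation.search(line)), line)
```

The first sample is found only by the alternation form; the second is found by both.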

(2) Is there any way to narrow it down to, say, και before δε, with the 2nd word not more than x words from the 1st?

Probably, but this is already problematic, and I can't think of the construction off the top of my head.
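
For anyone who does want a limit counted in words, one possible construction is to repeat a "space plus run of non-space characters" group a bounded number of times. This is only a sketch tested with Python's re module on invented sample lines; whether SwordSearcher's RegEx tab accepts it unchanged is an assumption.

```python
import re

# At most 2 other words between και and δε (change the 2 as needed).
# ( [^ ]+) is one intermediate word: a space followed by non-space characters.
within_two_words = re.compile(r"( |^)και( [^ ]+){0,2} δε( |$)")

# Invented sample lines (not actual TR verse text).
near = "και αυτος δε ειπεν"            # 1 word between
far = "και ο ανηρ ο αγαθος δε ειπεν"   # 4 words between

print(bool(within_two_words.search(near)))  # True
print(bool(within_two_words.search(far)))   # False
```

Using [^ ] rather than a word-character class sidesteps the question of how the engine classifies Greek letters, since it only distinguishes spaces from everything else.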
 
Thank you, Brandon. I can sort through them – I'd rather have all of them and have to sort through them than miss some. Thanks for your time!
 
(2) Is there any way to narrow it down to, say, και before δε, with the 2nd word not more than x words from the 1st?

Actually, if you don't care about the exact number of words, it wouldn't be hard to add a quantifier that caps the number of LETTERS between the words:

( |^)και .{0,15} δε( |$)

Change 15 to whatever maximum number of characters you want between the words.

Also that should fix the issue with the beginning of the verse without including words that end in και.
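
To see how the upper bound behaves, here is a small sketch in Python's re module (a stand-in, not SwordSearcher itself). The helper name gap_pattern and the sample line are made up for illustration. Note that Python counts characters; if the search engine counts bytes instead (an assumption, suggested only by the Unicode remarks earlier), each Greek letter would use up two of the allowed positions.

```python
import re

def gap_pattern(max_chars):
    # Same shape as the search above; only the upper bound changes.
    return re.compile(r"( |^)και .{0,%d} δε( |$)" % max_chars)

# Invented sample line (not actual TR verse text).
line = "και ο ανθρωπος εκεινος δε ειπεν"
between = "ο ανθρωπος εκεινος"   # the text that .{0,N} has to cover

print(len(between))                    # 18 characters
print(len(between.encode("utf-8")))    # 34 bytes in UTF-8
print(bool(gap_pattern(15).search(line)))  # False: 18 characters do not fit in 15
print(bool(gap_pattern(25).search(line)))  # True
```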
 
Great! Just what I needed.

I think I found a little display bug though, in using these. When I changed the # of characters to 25, I got 107 hits. The first 100 have the hits highlighted. But when I go forward to the next seven, they are not highlighted.
 
I'm just documenting this here so I can find it later. :) Today I used this search

( |^)τε .{0,15} και( |$)

in the RegEx search tab in SwordSearcher 8.4 to find τε before and within 15 characters of και in the TR module.
 
I am just documenting this for my info and the info of any others who may find it interesting.

I want to find verses where forms of the verb "stand" are followed (not necessarily immediately) by "before". Jos 7:12 and Jos 21:44 are examples. 2Ki 2:4 is also an example.

What interests me is that the KJV expression "not standing before" someone seems, in some contexts, to mean not being able to resist or withstand that person. I guess it means "not able to remain standing", i.e., to put it in convoluted English with two negatives: "not able (strong enough) not to be overcome". In any case, here is what I ran in the RegEx search and what each returned:
[ ^]stand .* before[ $]
gave 14 vv. with hits.
[ ^]stood .* before[ $]
also returned 14 vv. with hits.
( |^)stand .{0,100} before( |$)
which searches within 100 characters, also returned 14 vv. with hits.
( |^)stood .{0,100} before( |$)
got 13 hits; moving up in 25-character increments, I finally got 14 hits when I used "175" as the # of characters.
 
Apparently
Code:
( |^)stand .{0,175} before( |$)
does not find "stand before" when the two words are right next to each other. See Jos 23:9, which is not a hit with this RegEx search. So to get all instances, one needs to run the RegEx searches and also do simple searches for "stand before", "stood before", and "standeth before".
 
Remove the extra space:

Code:
( |^)stand .{0,175}before( |$)
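
The effect of that extra space can be checked with a quick sketch in Python's re module (used here as a stand-in for the search tab; the sample lines and the token "herebefore" are invented). It also shows a third variant, an assumption of mine rather than something from this thread, that makes the whole "gap plus space" optional so that "before" must still start a word.

```python
import re

# Invented sample lines for illustration.
adjacent = "no man shall stand before you"
separated = "they stand in the gate before him"
glued = "stand herebefore"   # made-up token ending in "before"

strict = re.compile(r"( |^)stand .{0,175} before( |$)")          # needs two spaces' worth of gap
no_space = re.compile(r"( |^)stand .{0,175}before( |$)")         # extra space removed
optional_gap = re.compile(r"( |^)stand (.{0,175} )?before( |$)")  # gap and its space are optional

for name, rx in [("strict", strict), ("no_space", no_space), ("optional_gap", optional_gap)]:
    print(name, [bool(rx.search(s)) for s in (adjacent, separated, glued)])
```

With the space removed, adjacent words are found, at the cost of also accepting any word that merely ends in "before"; in English text that is unlikely to matter.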
 
If I may suggest an alternative:

Code:
\b(stand|stood)\b.*?\bbefore\b

  • Assert position at a word boundary (position preceded or followed—but not both—by an ASCII letter, digit, or underscore)
  • Match the regex below and capture its match into backreference number 1
    • Match this alternative (attempting the next alternative only if this one fails)
      • Match the character string “stand” literally (case insensitive)
    • Or match this alternative (the entire group fails if this one fails to match)
      • Match the character string “stood” literally (case insensitive)
  • Assert position at a word boundary (position preceded or followed—but not both—by an ASCII letter, digit, or underscore)
  • Match any single character that is NOT a line break character (line feed, carriage return, form feed, vertical tab, next line, line separator, paragraph separator)
    • Between zero and unlimited times, as few times as possible, expanding as needed (lazy)
  • Assert position at a word boundary (position preceded or followed—but not both—by an ASCII letter, digit, or underscore)
  • Match the character string “before” literally (case insensitive)
  • Assert position at a word boundary (position preceded or followed—but not both—by an ASCII letter, digit, or underscore)
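
A rough way to try this pattern outside the program is Python's re module, shown below on invented sample lines. Two caveats: case-insensitive matching is switched on explicitly here, and whether \b behaves the same way in SwordSearcher's engine is an assumption. The second pattern, with optional "eth"/"ing" endings, is my own illustrative extension and not part of the suggestion above.

```python
import re

pattern = re.compile(r"\b(stand|stood)\b.*?\bbefore\b", re.IGNORECASE)

# Invented sample lines for illustration.
print(pattern.search("no man shall stand before you").group(0))     # stand before
print(pattern.search("They stood in the gate before him").group(0)) # stood in the gate before
print(pattern.search("he could not withstand them before night"))   # None: no boundary inside "withstand"
print(pattern.search("he was standing before the king"))            # None: "standing" is not "stand"

# Hypothetical extension that also accepts "standeth" and "standing".
extended = re.compile(r"\b(stand(?:eth|ing)?|stood)\b.*?\bbefore\b", re.IGNORECASE)
print(extended.search("he was standing before the king").group(0))  # standing before
```

Because the gap is .*? rather than a space-delimited run, adjacent words ("stand before") and separated words are both matched by the same search.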
 
It is very interesting, Eric. I don't understand regex coding at all, so your practical example is helpful. Thanks for sharing!
 