Wednesday, March 21, 2012

passing a string to a regex

I'm using a regex and want to find a specific tag, like TABLE.
This:
(< )(table)
matches this:
<table
<turkey
<amazon
i.e., it matches < followed by a t or a or b or l or e (at least that's what I
*think* it is doing ;o)
What's the proper way to write out an actual string for matching? In theory,
this should work:
(t)(a)(b)(l)(e)
The catch is that I'd like to pass 'table' as a string to this. So I'd like
to avoid having to split up the string as an array and then having to build
it like the above.

If your string to match against is table, your expression is just the word
table itself. You may want to add RegexOptions.IgnoreCase.
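
A minimal sketch of that advice, assuming the string arrives in a variable (called tagToFind here, as in the code posted later in the thread). Regex.Escape and the trailing \b are additions that the reply does not mention: the first guards against regex metacharacters in the passed string, the second stops "table" from also matching a longer tag name. The module name and sample strings are made up for illustration.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module LiteralPatternDemo
    Sub Main()
        ' tagToFind stands in for whatever string is passed in at run time.
        Dim tagToFind As String = "table"

        ' The pattern is simply "<" followed by the literal word.
        ' Regex.Escape neutralises any metacharacters in the string;
        ' \b requires a word boundary after the tag name.
        Dim r As New Regex("<" & Regex.Escape(tagToFind) & "\b", RegexOptions.IgnoreCase)

        Dim samples() As String = {"<table width='50'>", "<TABLE>", "<turkey width='50'>", "<amazon>"}
        For Each s As String In samples
            ' Only the two table tags should report True.
            Console.WriteLine(s & " : " & r.IsMatch(s).ToString())
        Next
    End Sub
End Module
```

With these samples only the two table tags should match; the turkey and amazon tags should not.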
So...this:
(table)
should only match "table"?
That doesn't seem to be happening for me--but maybe I have another issue
with my expression somewhere else. Good suggestion on the IgnoreCase option,
though!
-Darrel

You don't need the ( ); it's only a grouping construct. However, it should
still work. Post some code so we can take a look.
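
To illustrate the point about grouping, here is a small sketch comparing three patterns against the tags from the first post. It assumes nothing about the original poster's real pattern; the character class [table] is included only because it is one construct that would give the "t or a or b or l or e" behaviour described earlier. The module name is made up.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module GroupingDemo
    Sub Main()
        Dim samples() As String = {"<table", "<turkey", "<amazon"}

        ' "<table" and "<(table)" match the same text; the parentheses only capture it.
        ' "<[table]" uses a character class: "<" followed by any ONE of t, a, b, l, e.
        Dim patterns() As String = {"<table", "<(table)", "<[table]"}

        For Each p As String In patterns
            For Each s As String In samples
                Console.WriteLine(p & " vs " & s & " : " & Regex.IsMatch(s, p).ToString())
            Next
        Next
    End Sub
End Module
```

The first two patterns should match only <table, while the character-class pattern should match all three samples.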
Here's what I have:
' First pass: match the tag and the attribute somewhere in textToParse
Dim r1 As New Regex( _
"(?<anythingPreceding>((.|\n)*))" & _
"(?<theTag>(" & tagToFind & "))" & _
"(?<anything>(.[^>/]*))" & _
"(?<theAttribute>(" & attributeToFind & "))" & _
"(?<theEqualsSign>((\s*)=(\s*)))" & _
"(?<theAttributeValue>(.[^\s/>]*))" & _
"(?<anythingSucceeding>((.|\n)*))" _
, RegexOptions.IgnoreCase)
Dim m As Match = r1.Match(textToParse)
' Second pass: replace attributeToFind=value within the text matched above
Dim r2 As New Regex("(" & attributeToFind & ")((\s*)=(\s*))(.[^\s/>]*)")
Dim s As String = r2.Replace(m.ToString(), attributeToFind & "=""" & _
newAttributeValue & """")
Return s
Note that the second group (theTag) is the one I'm concerned with.
If I pass "table" to tagToFind, it will match these:
<table width='50' height='12'>
<turkey width='50' height='12'>
<apple width='50' height='12'>
So, I *thought* that it was an OR construct.
However, I now notice that it will also match:
<spaghetti width="100%" >
So there's obviously something wrong with my Regex in the bigger sense. Let
me stare at it for a bit and see if I can figure this one out ;o)
-Darrel
I'm pretty sure this is the culprit:
"(?<anythingPreceding>((.|\n)*))" & _
"(?<theTag>(" & tagToFind & "))" & _
Or at least part of the problem. The first line should match 'anything' up
until "<table" (I'm passing table to tagToFind)
So, I think I need to look for anything EXCEPT "<table". Correct?
I can't quite get the syntax down, though:
(.|\n(^<table))*
That doesn't seem to work.
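
One construct that can express "anything except <table" is a negative lookahead, (?!...), checked before each character is consumed. The sketch below is only an illustration under assumptions: the sample text is invented, "table" is hard-coded where the thread passes tagToFind, and the module name is made up. A lazy quantifier such as (.|\n)*? would be another way to stop at the first occurrence.

```vbnet
Imports System
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module PrecedingTextDemo
    Sub Main()
        ' Invented sample: some markup, a line break, then the tag of interest.
        Dim html As String = "<p>intro</p>" & vbLf & "<table width='50'>"

        ' (?!<table) fails at any position where "<table" begins, so the
        ' repeated group consumes one character at a time and stops just
        ' before the first "<table".
        Dim r As New Regex( _
            "^(?<anythingPreceding>(?:(?!<table)(?:.|\n))*)" & _
            "(?<theTag><table)", RegexOptions.IgnoreCase)

        Dim m As Match = r.Match(html)
        If m.Success Then
            ' Each named group can be read on its own.
            Console.WriteLine("preceding: " & m.Groups("anythingPreceding").Value)
            Console.WriteLine("tag: " & m.Groups("theTag").Value)
        End If
    End Sub
End Module
```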
I figured out what's going on.
I'm finding a large match and then applying a second Regex to it.
I need to learn how to use the groups ;o)
-Darrel
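
A sketch of what "using the groups" could look like: match only the tag in question, capture the part to keep in one named group, and rebuild the match with the new value. This is not the poster's final code. SetAttribute is a hypothetical helper, the parameter names are borrowed from the code above, the value pattern is widened to accept quoted values, and the lambda syntax assumes VB 2008 or later.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module ReplaceAttributeDemo
    ' Hypothetical helper; parameter names follow the thread's code.
    Function SetAttribute(textToParse As String, tagToFind As String, _
                          attributeToFind As String, newAttributeValue As String) As String
        ' "head" captures everything from "<tag" up to and including "attr =".
        ' "theAttributeValue" captures a quoted or unquoted value.
        Dim pattern As String = _
            "(?<head><" & Regex.Escape(tagToFind) & "\b[^>]*?\b" & _
            Regex.Escape(attributeToFind) & "\s*=\s*)" & _
            "(?<theAttributeValue>""[^""]*""|'[^']*'|[^\s/>]+)"
        Dim r As New Regex(pattern, RegexOptions.IgnoreCase)

        ' Keep the captured head unchanged and swap in the new quoted value.
        Return r.Replace(textToParse, _
            Function(m As Match) m.Groups("head").Value & """" & newAttributeValue & """")
    End Function

    Sub Main()
        Dim html As String = "<turkey width='50'><table width='50' height='12'>"
        ' Only the width on the table tag should change.
        Console.WriteLine(SetAttribute(html, "table", "width", "100%"))
    End Sub
End Module
```

Because the pattern starts at the tag itself rather than matching the whole document, there is no second pass that could touch attributes on other tags.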
