mersenneforum.org regular expression help

2009-03-01, 02:21   #1
ixfd64
Bemusing Prompter

"Danny"
Dec 2002
California

5·499 Posts
regular expression help

I'm having a bit of trouble with regular expressions in R.

I know the following regular expression

Code:
'^.*([-A-Za-z0-9_.%]+@[-A-Za-z0-9_.%]+\\.[A-Za-z]+).*$'
is supposed to return e-mail addresses using backreferences. However,

Quote:
 emailpat = '^.*([-A-Za-z0-9_.%]+@[-A-Za-z0-9_.%]+\\.[A-Za-z]+).*$'
 gsub(emailpat, '\\1', 'adadasd>aaaaaaabbbbbbb@cccc.com')
returns

Quote:
 [1] "b@cccc.com"
and cuts off everything before the @ except for the last letter. Does anyone know what I'm doing wrong?

Thanks.

2009-03-01, 03:33   #2
wblipp

"William"
May 2003
Near Grandkid

45058 Posts

I've run into different definitions of regular expressions from time to time. But assuming your situation is like the one described here:

http://www.regular-expressions.info/reference.html

your problem is that the first ".*" is greedy, trying the longest possible matches first. I think you need to make it lazy, changing the .* to .*?

Code:
emailpat = '^.*?([-A-Za-z0-9_.%]+@[-A-Za-z0-9_.%]+\\.[A-Za-z]+).*$'

If your environment doesn't support lazy quantifiers, then I'd suggest a "not a regular character" before the "one or more regular characters" and outside the parentheses. So

Code:
emailpat = '^.*[^-A-Za-z0-9_.%]([-A-Za-z0-9_.%]+@[-A-Za-z0-9_.%]+\\.[A-Za-z]+).*$'

Good luck,
William
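To make the greedy behaviour concrete, here is a minimal sketch in R that runs the original pattern and the "not a regular character" variant side by side on the sample string from the first post. The variable names (`x`, `addr`, `greedy`, `guarded`) are made up for illustration and are not from the thread.

```r
# Sample input from the first post.
x <- 'adadasd>aaaaaaabbbbbbb@cccc.com'

# The address part of the pattern, as written in the thread
# (illustrative variable name, not from the thread).
addr <- '[-A-Za-z0-9_.%]+@[-A-Za-z0-9_.%]+\\.[A-Za-z]+'

# Greedy prefix: '.*' grabs as much as it can and leaves the capture
# group only the minimum it needs, a single character before the '@'.
greedy <- paste0('^.*(', addr, ').*$')

# Guarded prefix: the character just before the capture group must NOT
# be an address character, so the group has to start right after '>'.
guarded <- paste0('^.*[^-A-Za-z0-9_.%](', addr, ').*$')

greedy_result  <- gsub(greedy,  '\\1', x)
guarded_result <- gsub(guarded, '\\1', x)

print(greedy_result)   # [1] "b@cccc.com"
print(guarded_result)  # [1] "aaaaaaabbbbbbb@cccc.com"
```

Note that the guarded variant needs at least one non-address character in front of the address, so it would not match an address that sits at the very start of the string.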
2009-03-01, 06:19   #3
ixfd64
Bemusing Prompter

"Danny"
Dec 2002
California

5·499 Posts

The first one didn't work but the second did. Thanks so much!
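The thread does not say why the lazy version failed. One possible explanation, which is only an assumption here, is that the default regex engine in the R version being used did not honor lazy quantifiers. As a sketch, passing `perl = TRUE` to `gsub` asks R to use Perl-compatible regular expressions, which do support `.*?`. The variable names `emailpat_lazy`, `x`, and `result` are made up for illustration.

```r
# Lazy pattern from post #2.
emailpat_lazy <- '^.*?([-A-Za-z0-9_.%]+@[-A-Za-z0-9_.%]+\\.[A-Za-z]+).*$'

# Sample input from the first post.
x <- 'adadasd>aaaaaaabbbbbbb@cccc.com'

# perl = TRUE selects the Perl-compatible engine, where '.*?' consumes
# as little as possible before the capture group gets a chance to match.
result <- gsub(emailpat_lazy, '\\1', x, perl = TRUE)

print(result)  # [1] "aaaaaaabbbbbbb@cccc.com"
```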
