mersenneforum.org Bashing Strings Parsimoniously in Linux
2013-05-03, 14:00   #1
EdH

"Ed Hall"
Dec 2009
Adirondack Mtns

5102₁₀ Posts

Any BASH gurus looking for a frustrated "amateur" programmer to help out?

I'm trying to do what I thought would be a simple task, but it seems to be more trouble than it's worth...

The crux: I have a string:
Code:
[May 02 2013, 23:52:50] Cofactor 127881022016992373630150068641535329304189051054726878835804414980494663758554596515035599713886561221873 (105 digits)

I want only the inner number. I have successfully trimmed the front:
Code:
echo $templine
echo
cpos=$(expr index "$templine" C)
let cpos=$cpos+8
templine=${templine:$cpos}
echo $templine

Result:
Code:
[May 02 2013, 23:52:50] Cofactor 127881022016992373630150068641535329304189051054726878835804414980494663758554596515035599713886561221873 (105 digits)

127881022016992373630150068641535329304189051054726878835804414980494663758554596515035599713886561221873 (105 digits)

But I'm having difficulty trimming the end off. I can't find any way to search for a space or open parenthesis. I'll skip telling all the things I've tried, since they didn't work.

Any assistance is gratefully appreciated.

Thanks...

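
Editor's sketch: one way to continue the offset-based approach from the post above and locate the space and open parenthesis, using only bash parameter expansion. The helper name `find_index` is hypothetical (it is not a bash builtin), and the value 9 is simply the length of "Cofactor ".

```shell
#!/usr/bin/env bash
# find_index is a hypothetical helper (not a builtin): it prints the 0-based
# offset of the first occurrence of $2 within $1, or -1 if it is absent.
find_index() {
    local s=$1 needle=$2 before
    before=${s%%"$needle"*}   # quoting "$needle" keeps ( and spaces literal
    if [[ $before == "$s" ]]; then
        echo -1
    else
        echo "${#before}"
    fi
}

templine='[May 02 2013, 23:52:50] Cofactor 127881022016992373630150068641535329304189051054726878835804414980494663758554596515035599713886561221873 (105 digits)'

# Skip past "Cofactor " (9 characters) to reach the first digit.
start=$(( $(find_index "$templine" 'Cofactor ') + 9 ))
rest=${templine:start}

# The number ends where " (" begins.
len=$(find_index "$rest" ' (')
number=${rest:0:len}
echo "$number"
```

The key detail is the quoting: an unquoted `(` or space is interpreted by the shell before the search ever sees it, which is a likely reason the earlier attempts failed.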
2013-05-03, 14:13   #2
chalsall
If I May

"Chris Halsall"
Sep 2002

2×5×1,087 Posts

Quote:
 Originally Posted by EdH Any BASH gurus looking for a frustrated "amateur" programmer to help out?
Code:
[chalsall@burrow edh]$ cat input.txt
[May 02 2013, 23:52:50] Cofactor 127881022016992373630150068641535329304189051054726878835804414980494663758554596515035599713886561221873 (105 digits)
[chalsall@burrow edh]$ sed 's/.* Cofactor \([0-9]*\).*/\1/' input.txt
127881022016992373630150068641535329304189051054726878835804414980494663758554596515035599713886561221873

#3
Gimarel

To remove the prefix and the suffix just with bash you can do something like:
Code:
withoutsuffix=${templine% \(*}
number=${withoutsuffix##* }

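
Editor's sketch: the two expansions above rely on bash's pattern-removal operators. The example below runs them on the string from post #1 and then shows all four operators on a short sample; the variable `sample` is made up for illustration.

```shell
#!/usr/bin/env bash
templine='[May 02 2013, 23:52:50] Cofactor 127881022016992373630150068641535329304189051054726878835804414980494663758554596515035599713886561221873 (105 digits)'

# %  removes the SHORTEST match of the pattern " (*" from the end.
withoutsuffix=${templine% \(*}
# ## removes the LONGEST match of the pattern "* " from the front.
number=${withoutsuffix##* }
echo "$number"

# The four operators side by side on a hypothetical sample string:
sample='a b c'
echo "${sample#* }"    # shortest prefix removed -> b c
echo "${sample##* }"   # longest prefix removed  -> c
echo "${sample% *}"    # shortest suffix removed -> a b
echo "${sample%% *}"   # longest suffix removed  -> a
```

The backslash before `(` only protects the parenthesis from the shell; inside the pattern it is an ordinary literal character.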
2013-05-03, 15:22   #4
EdH

"Ed Hall"
Dec 2009
Adirondack Mtns

1001111101110₂ Posts

Excellent! Thanks for the help. I've implemented Gimarel's code into my script at this point, but will be studying Chalsall's further as well.

Could someone point me toward a good study reference? I would like to specifically find out how to parse a string that includes several double quote characters, as well as understand fully the previous posts.

Thanks much!!

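
Editor's sketch: the thread never answers the double-quote question, so the following is only one possible approach, assuming the quoted values contain no embedded or escaped quotes. The input `line` and the array names `parts` and `quoted` are invented for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical input with several double-quoted fields.
line='name="Ed Hall" place="Adirondack Mtns" posts="5102"'

# Split the line on the double quote character. Text outside quotes lands
# at even indexes, text inside quotes at odd indexes.
IFS='"' read -r -a parts <<< "$line"

quoted=()
for ((i = 1; i < ${#parts[@]}; i += 2)); do
    quoted+=("${parts[i]}")
done

printf '%s\n' "${quoted[@]}"
```

Single-quoting the assignment keeps the double quotes as plain data, and setting `IFS` only for the `read` command leaves word splitting in the rest of the script untouched.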
2013-05-03, 15:28   #5
chalsall
If I May

"Chris Halsall"
Sep 2002

2×5×1,087 Posts

Quote:
 Originally Posted by EdH Could someone point me toward a good study reference?
For sed (Stream EDitor), this is pretty good.

If you haven't wrapped your head around Regular Expressions (regex) yet, just Google for things like "regular expression cheat sheet", "regular expression examples", etc. Regex can seem a little intimidating initially, but they're well worth the effort to learn.
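
Editor's sketch: regular expressions can also be used without leaving bash, through the `[[ ... =~ ... ]]` test and the `BASH_REMATCH` array. This is an illustration on the string from post #1, not something posted in the thread; the variable name `re` is arbitrary.

```shell
#!/usr/bin/env bash
templine='[May 02 2013, 23:52:50] Cofactor 127881022016992373630150068641535329304189051054726878835804414980494663758554596515035599713886561221873 (105 digits)'

# Extended regex: the parentheses capture the run of digits after "Cofactor ".
# Keeping the pattern in a variable avoids quoting surprises inside [[ ]].
re='Cofactor ([0-9]+) '

if [[ $templine =~ $re ]]; then
    number=${BASH_REMATCH[1]}   # index 1 holds the first capture group
    echo "$number"
else
    echo "no match" >&2
fi
```

Note that `=~` uses extended regex syntax, so the group is written `( )` rather than the `\( \)` that basic sed patterns use.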

2013-05-03, 15:54   #6
EdH

"Ed Hall"
Dec 2009

2·2,551 Posts

Quote:
 Originally Posted by chalsall For sed (Stream EDitor), this is pretty good. If you haven't wrapped your head around Regular Expressions (regex) yet, just Google for things like "regular expression cheat sheet", "regular expression examples", etc. Regex can seem a little intimidating initially, but they're well worth the effort to learn.
Thanks! I was using the following for reference:

Advanced Bash Shell Scripting Guide - A Brief Introduction to Regular Expressions

But, I couldn't find a way to parse the space or ( in my previous posts.

I had briefly looked at sed and awk as referenced somewhere, but thought I'd better spend some time with BASH before branching off too much. It also seemed like sed is for working within a file, which didn't seem to be what I was doing, and awk was yet another scripting tool (or, so it seemed).

Anyway, thanks for all the help.

Off to study...
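
Editor's sketch: sed reads standard input when no file is named, so it can work on a shell variable as well as a file. The example assumes bash (for the `<<<` here-string) and applies a sed substitution like the one earlier in the thread to `templine`.

```shell
#!/usr/bin/env bash
templine='[May 02 2013, 23:52:50] Cofactor 127881022016992373630150068641535329304189051054726878835804414980494663758554596515035599713886561221873 (105 digits)'

# Pattern pieces:
#   .* Cofactor    everything up to and including " Cofactor "
#   \([0-9]*\)     the digits, captured as group 1
#   .*             the rest of the line
# The replacement \1 keeps only the captured digits.
number=$(sed 's/.* Cofactor \([0-9]*\).*/\1/' <<< "$templine")
echo "$number"

# Equivalent form using a pipe instead of a here-string:
number_piped=$(printf '%s\n' "$templine" | sed 's/.* Cofactor \([0-9]*\).*/\1/')
```

Command substitution `$( ... )` captures sed's output back into a variable, which is what makes it usable inside a larger script.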

2013-05-03, 16:05   #7
chalsall
If I May

"Chris Halsall"
Sep 2002

2×5×1,087 Posts

Quote:
 Originally Posted by EdH I had briefly looked at sed and awk as referenced somewhere, but thought I'd better spend some time with BASH before branching off too much. It also seemed like sed is for working within a file, which didn't seem to be what I was doing, and awk was yet another scripting tool (or, so it seemed).
To be perfectly honest, I've never taught myself much BASH scripting. My position is if I'm going to script something, I'll write it in Perl which is much more capable. Plus, the code then can be moved into CGI scripts, will run under Windows, etc.

But, no matter what tool you are using, understanding Regex is key.

Quote:
 Originally Posted by EdH Anyway, thanks for all the help.
You are most welcome.

2013-05-04, 09:46   #8
Mr. P-1

Jun 2003

10010010001₂ Posts

Quote:
 Originally Posted by EdH I have a string: Code: [May 02 2013, 23:52:50] Cofactor 127881022016992373630150068641535329304189051054726878835804414980494663758554596515035599713886561221873 (105 digits) I want only the inner number.
Code:
cut -d' ' -f6

2013-05-04, 19:26   #9
EdH

"Ed Hall"
Dec 2009

1001111101110₂ Posts

Quote:
 Originally Posted by Mr. P-1 Code: cut -d' ' -f6
I "think I" can see what this does, but not sure how to implement it, yet. "The implementation is left as an exercise for the reader." Off to exercise...
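
Editor's sketch of one way to implement the `cut` suggestion inside a script, assuming bash and assuming the fields are always separated by exactly one space, as they are in the string from post #1.

```shell
#!/usr/bin/env bash
templine='[May 02 2013, 23:52:50] Cofactor 127881022016992373630150068641535329304189051054726878835804414980494663758554596515035599713886561221873 (105 digits)'

# -d' ' sets the delimiter to a single space; -f6 selects the sixth field.
# Fields: 1=[May 2=02 3=2013, 4=23:52:50] 5=Cofactor 6=<number> 7=(105 8=digits)
number=$(cut -d' ' -f6 <<< "$templine")
echo "$number"
```

Because `cut` treats every delimiter as a field boundary, a line with two consecutive spaces would shift the field numbers; the sed and parameter-expansion approaches do not have that sensitivity.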

2013-05-04, 22:30   #10
Nick

Dec 2012
The Netherlands

5·353 Posts

Quote:
 Originally Posted by chalsall But, no matter what tool you are using, understanding Regex is key.
Absolutely. And understanding the way that regular languages correspond with finite automata is the beginning of real computer science.
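
Editor's sketch: to make the correspondence concrete, the regular expression `^[0-9]+$` can be written out by hand as a three-state finite automaton. The function name `accepts_digits` and the state names are invented for this illustration.

```shell
#!/usr/bin/env bash
# Hand-written DFA equivalent to the regex ^[0-9]+$
# States: start (nothing read yet), digits (accepting), reject (dead state).
accepts_digits() {
    local input=$1 state=start ch i
    for ((i = 0; i < ${#input}; i++)); do
        ch=${input:i:1}
        case "$state:$ch" in
            start:[0-9] | digits:[0-9]) state=digits ;;   # a digit keeps us alive
            *) state=reject ;;                            # anything else is fatal
        esac
    done
    [[ $state == digits ]]   # succeed only if we stopped in the accepting state
}

accepts_digits 12345 && echo "12345 accepted"
accepts_digits 12a45 || echo "12a45 rejected"
accepts_digits ""    || echo "empty string rejected"
```

Each character is examined once and the only memory is the current state, which is the defining property of a finite automaton.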

2013-05-04, 22:54   #11
chalsall
If I May

"Chris Halsall"
Sep 2002

10870₁₀ Posts

Quote:
 Originally Posted by Nick Absolutely. And understanding the way that regular languages correspond with finite automata is the beginning of real computer science.
Possibly...

What is called Computer Science currently is really just Engineering.

IMHO, when we finally get to Computer Science (and we're close), it will be non-deterministic....

