Using regex in shell

mahjong Member
edited April 2012 in Help

Hi all!

I started writing a really small shell script, but I'm stuck on a key part. I have a multi-line variable and I would like to match only a URL in it with a regex. In PHP I would use preg_match. I read a lot and found grep, which doesn't work for me (I don't need the whole line), so I guess the only thing that would work is awk (correct me if I'm wrong).

So please help me find a regex pattern matcher for the shell.

Code:
pastebin.com/TczfVMK2

Please grab the src=" part with a shell regex into a variable (not a text file); that will give me the command I need.

Thank you!

PS: The code is just an example; any kind of regex example is welcome. The important part is that it should find a matching pattern in a variable and output it into a variable. The match is not a whole line, only a small part of a long HTML/text document.

Comments

  • netomx Moderator, Veteran

    Please put your code in <pre> tags.

    Thanked by 1TheHackBox
  • yomero Member
    edited April 2012

    Try something with 'sed'

    (Or use perl)

  • fly Member

    ed is probably better.

    although i still recommend you use php... there's a reason why they wrote parsers for php

    Thanked by 2TheHackBox netomx
  • nabo Member

    @mahjong said: So please help me to find a regex pattern matcher in shell.

    There you go: http://www.math.utah.edu/docs/info/gawk_5.html

  • fly Member
    edited April 2012

    yeah awk is pretty strong.

    it IS a little bit difficult to get going at first, but you'll be regex pro once you get the hang of it.
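For what it's worth, a minimal awk sketch of the "match a small part of a variable" idea could look like the following. The `html` sample markup and the variable names are made up for illustration, since the pastebin content isn't reproduced in the thread; `match()`, `RSTART` and `RLENGTH` are standard awk.

```shell
# Hypothetical multi-line input standing in for the pastebin content
html='<html>
<script type="text/javascript" src="http://example.com/js/app.js"></script>
</html>'

url=$(printf '%s\n' "$html" | awk '
    # match() sets RSTART and RLENGTH to the position and length of the match
    match($0, /src="[^"]*"/) {
        # skip the 5 leading characters of the attribute and drop the closing quote
        print substr($0, RSTART + 5, RLENGTH - 6)
        exit    # stop after the first matching line
    }')

printf '%s\n' "$url"
```

Removing the `exit` should print one result per matching line instead of only the first.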

  • Something like this?

    var=$(sed -n 's/.*src="\(http[^"]*\)".*/\1/p' stuff.txt)
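Since the question asks for the input to come from a variable rather than a file, here is a rough sketch of the same sed idea fed through a pipe. The `html` markup is a hypothetical stand-in (only the two URLs appear in this thread), and `[^"]*` is used instead of `.*` so the capture stops at the closing quote.

```shell
# Hypothetical input; the surrounding markup is assumed, not taken from the pastebin
html='<script type="text/javascript" src="http://ajax.googleapis.com/ajax/libs/swfobject/2.1/swfobject.js"></script>
<script type="text/javascript" src="clientscript/hsjs.php"></script>'

# -n plus the p flag prints only the lines where the substitution succeeded
urls=$(printf '%s\n' "$html" | sed -n 's/.*src="\([^"]*\)".*/\1/p')

# Keep only the first hit if a single value is wanted
first=$(printf '%s\n' "$urls" | head -n 1)

printf '%s\n' "$first"
```

Note that this yields at most one result per input line; if a line holds several src attributes, the leading `.*` means the last one on that line is the one captured.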
    
  • Yeah, thanks to all. I thought sed was only for replacing, but I will try it. I also know gawk/awk is a really good option, but currently I don't have time to learn it and just need a fast preg_match alternative.
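If the script runs under bash specifically (an assumption; plain sh does not have this), the closest thing to preg_match is probably the `[[ ... =~ ... ]]` test, which fills the `BASH_REMATCH` array with the capture groups. A minimal sketch, with made-up sample input and variable names:

```shell
#!/usr/bin/env bash
# Hypothetical multi-line input standing in for the pastebin content
html='<html>
<script type="text/javascript" src="http://example.com/js/app.js"></script>
</html>'

# Keep the pattern in a variable so the quotes inside it need no extra escaping
re='src="([^"]*)"'

if [[ $html =~ $re ]]; then
    url=${BASH_REMATCH[1]}   # first capture group, roughly $matches[1] in PHP
else
    url=''
fi

printf '%s\n' "$url"
```

This only returns the first match in the variable, and no external process is started.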

  • sed/awk ftw.
    And btw, there are different versions of awk. Some are faster than others, if you need to process heavy files.

  • Mon5t3r Member
    edited April 2012

    When I'm stuck and there's nothing I can do with my (shell/bash) script:

    cat /dev/null > file.sh

    and start a new life after that..

  • yomero Member
    edited April 2012
    $ > file.sh
    

    And start a new life faster xD

    Thanked by 1Mon5t3r
  • MrDOS Member
    edited April 2012

    grep with the -o option?

    $ grep -o 'src=".*"' < TczfVMK2.txt
    src="http://ajax.googleapis.com/ajax/libs/swfobject/2.1/swfobject.js"
    src="clientscript/hsjs.php"
    

    Then a couple sed rules to chop off the attribute:

    $ grep -o 'src=".*"' < TczfVMK2.txt | sed -e 's/^src="//' -e 's/"$//'
    http://ajax.googleapis.com/ajax/libs/swfobject/2.1/swfobject.js
    clientscript/hsjs.php
    
    Thanked by 2yomero exussum
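One caveat with `.*` is that it is greedy, so the pattern above works as long as the src value is the last quoted thing on the line. A small sketch of the difference, using a hypothetical line where another attribute follows src:

```shell
# Hypothetical line with a second quoted attribute after src
line='<script src="clientscript/hsjs.php" type="text/javascript"></script>'

# Greedy .* runs on to the last double quote on the line
greedy=$(printf '%s\n' "$line" | grep -o 'src=".*"')

# [^"]* stops at the first closing quote
tight=$(printf '%s\n' "$line" | grep -o 'src="[^"]*"')

printf '%s\n' "$greedy" "$tight"
```

With this input, `greedy` also picks up the type attribute while `tight` holds only the src attribute, which is why `[^"]*` is usually the safer choice here.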
  • @MrDOS said: grep with the -o option?

    OMFG! New tricks! :D

  • MrDOS Member
    edited April 2012

    :D With enough piping, one can do almost anything in the world with a combination of cat, grep, sed, wc, expr, and seq. (Or, alternatively, one can do almost anything awk can do with enough piping and some combination of those...)

    And actually, now that I have a little more time to look at this, I notice that really only one sed expression is needed if it uses a backreference:

    grep -o 'src=".*"' < TczfVMK2.txt | sed -e 's/^src="\(.*\)"$/\1/'
    

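    (Yes, the parentheses need backslashes even inside single quotes: the quotes only protect them from the shell, and sed's default basic regular expressions treat \( and \) as the grouping operators.)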
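As a side note, GNU and BSD sed both accept `-E` to switch to extended regular expressions, where plain parentheses do the grouping. A quick sketch comparing the two spellings, with a made-up one-line input:

```shell
# Hypothetical one-line input
html='<script src="clientscript/hsjs.php"></script>'

# Basic regex (default): groups are written \( \)
bre=$(printf '%s\n' "$html" | sed -n 's/.*src="\([^"]*\)".*/\1/p')

# Extended regex (-E): groups are written ( )
ere=$(printf '%s\n' "$html" | sed -n -E 's/.*src="([^"]*)".*/\1/p')

printf '%s\n' "$bre" "$ere"
```

Both variables should end up holding the same value; which form to use is mostly a matter of taste and of which sed the target system ships.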
