Python - extract price from HTML tag

- February 15, 2015

This question already has an answer here:

RegEx match open tags except XHTML self-contained tags (35 answers)

<dt class="col2"> <p>rs. 2691.00 </p> </dt>

From the HTML above, I need to extract the price using regular expressions. I used BeautifulSoup for parsing.

Can anyone propose a regular expression for this?

If you're trying to get "2691.00", use:

(?<=rs\.)\s*(\d+\.\d{2})

Most regex engines don't allow * inside a lookbehind, so to keep the pattern flexible enough not to fail when there is more than one space, I left the \s* in the main match. You can either use the main match and trim off the excess spaces, or use capture group 1.
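As a rough sketch of both options in Python's built-in re module: the function name extract_price is made up for illustration, and the sample string is the snippet from the question.

```python
import re

# Pattern from the answer: "rs." must precede the match (lookbehind),
# optional whitespace and the price are part of the match itself.
PATTERN = re.compile(r"(?<=rs\.)\s*(\d+\.\d{2})")

def extract_price(html):
    """Return the price as a string, or None if nothing matches.

    extract_price is a hypothetical helper name, not part of any library.
    """
    match = PATTERN.search(html)
    if match is None:
        return None
    whole = match.group(0)      # main match, may start with spaces
    trimmed = whole.strip()     # option 1: trim the main match
    captured = match.group(1)   # option 2: use capture group 1
    assert trimmed == captured  # both options give the same text
    return captured

html = '<dt class="col2"> <p>rs. 2691.00 </p> </dt>'
print(extract_price(html))  # 2691.00
```

Capture group 1 is usually the simpler choice, since it needs no trimming afterwards.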

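(?<= ) is a positive lookbehind. It tells the regex engine that whatever is inside it has to be matched immediately before the main match, but should not be included in the match.

rs\. matches "rs.". In regex, the . character matches (almost) any character, so you have to escape it to match a literal period.

\s matches whitespace (spaces, tabs, newlines).

* matches the preceding token between 0 and infinitely many times.

\d matches digits.

+ matches the preceding token between 1 and infinitely many times. It is similar to *, but it has to find at least 1 successful match.

{2} means it has to find exactly 2 of whatever comes before it. \d{2} is the same as \d\d.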
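The pieces above can be put side by side with the pattern using Python's re.VERBOSE flag, which ignores whitespace in the pattern and allows comments. This is only a sketch; the name PRICE_RE is an assumption for illustration.

```python
import re

# Same pattern as in the answer, spread out with one comment per token.
# PRICE_RE is a hypothetical name chosen for this example.
PRICE_RE = re.compile(
    r"""
    (?<=rs\.)   # positive lookbehind: "rs." must come right before, not included in the match
    \s*         # zero or more whitespace characters
    (           # start of capture group 1
      \d+       # one or more digits
      \.        # a literal period (escaped)
      \d{2}     # exactly two digits, same as \d\d
    )           # end of capture group 1
    """,
    re.VERBOSE,
)

m = PRICE_RE.search("<p>rs. 2691.00 </p>")
print(m.group(1))  # 2691.00
```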

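I also put parentheses around the part that matches the price to create a group. This lets you extract that group from the entire match. You can take this further if you want to extract only the "dollar" amount or the change, by replacing the group with: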

((\d+)\.(\d{2}))

Then (I may have the order wrong, but groups are numbered by the position of their opening parenthesis) capture group 1 will contain 2691.00, capture group 2 will contain 2691, and capture group 3 will contain 00.
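A quick way to check the group numbering is to run it. The sketch below assumes the nested groups replace the single group in the earlier pattern; split_price is a made-up helper name.

```python
import re

# Nested groups dropped into the original pattern (an assumption about
# how the two snippets are meant to be combined).
NESTED = re.compile(r"(?<=rs\.)\s*((\d+)\.(\d{2}))")

def split_price(text):
    """Return (full price, whole part, fractional part) or None."""
    m = NESTED.search(text)
    if m is None:
        return None
    return m.group(1), m.group(2), m.group(3)

print(split_price("<p>rs. 2691.00 </p>"))  # ('2691.00', '2691', '00')
```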
