Python - extract price from HTML tag
<dt class="col2"> <p>rs. 2691.00 </p> </dt>
From the above HTML code, I need to extract the price using regular expressions. I used BeautifulSoup for parsing.
Can anyone propose a regular expression for the above?
If you're trying to get "2691.00", use:
(?<=rs\.)\s*(\d+\.\d{2})
Most regex engines can't use * in a lookbehind, so to make it dynamic enough not to fail if there's more than 1 space, I left the \s* in the main group. You can either use the main match and trim off the excess spaces, or use capture group 1.
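As a minimal sketch of the two options just described (trim the main match, or read capture group 1), here is the pattern applied with Python's standard `re` module. The variable names are only illustrative, and the input string is the snippet from the question; if you are using BeautifulSoup, you would presumably run the same pattern on the tag's text instead.

```python
import re

# The snippet from the question, used here as plain text.
html = '<dt class="col2"> <p>rs. 2691.00 </p> </dt>'

# The lookbehind is fixed-width ("rs."), which Python's re module requires.
# Assumption: the page really uses lowercase "rs."; pass re.IGNORECASE
# to re.compile if it might be "Rs." instead.
pattern = re.compile(r"(?<=rs\.)\s*(\d+\.\d{2})")

match = pattern.search(html)

# Option 1: the main match, which still carries the leading whitespace.
whole = match.group(0) if match else None
# Option 2: capture group 1, which is just the number.
price = match.group(1) if match else None

print(repr(whole))  # ' 2691.00'
print(repr(price))  # '2691.00'
```

Either `whole.strip()` or `price` gives the bare number as a string; convert it with `float()` or `decimal.Decimal()` if you need to do arithmetic on it.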
(?<= ) is a positive lookbehind. It tells the regex engine that whatever is inside of it has to be matched before the main matching group, but should not be included in the match.
rs\. matches "rs.". In regex the . character matches any character, so you have to escape it to match a literal period.
\s matches whitespace (spaces, tabs, newlines).
* matches between 0 and infinity times.
\d matches digits.
+ matches between 1 and infinity times. It is similar to *, but it has to find at least 1 successful match.
{2} means it has to find exactly 2 of whatever is before it. \d{2} is the same as \d\d.
And I have parentheses around the price's match to create a group. This allows you to extract the group from the entire match. This can be taken further if you want to extract the "dollar" amount or the change separately, with:
((\d+)\.(\d{2}))
Then, since groups are numbered by the position of their opening parenthesis, capture group 1 will contain 2691.00, capture group 2 will contain 2691, and capture group 3 will contain 00.
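To check the group numbering, here is a small sketch using Python's `re` module. The helper name `split_price` is hypothetical, made up for this example, and it assumes the price always has exactly two decimal digits, as in the question.

```python
import re

# Group 1: the whole amount, group 2: the integer part, group 3: the decimals.
PRICE_PARTS = re.compile(r"((\d+)\.(\d{2}))")


def split_price(text):
    """Return (amount, whole_part, fraction) as strings, or None if no price is found."""
    m = PRICE_PARTS.search(text)
    if m is None:
        return None
    return m.group(1), m.group(2), m.group(3)


parts = split_price("rs. 2691.00 ")
print(parts)  # ('2691.00', '2691', '00')
```

The function returns None instead of raising when nothing matches, so the caller can decide what a missing price should mean.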