ruby - Can't figure out why match result is nil


>> "<img src=\"https://filin.mail.ru/pic?width=90&amp;height=90&amp;email=multicc%40multicc.mail.ru&amp;version=4&amp;build=7\" style=\"\">".match(Regexp.new("<a href=\"http(s?):\/\/(?:\w+\.)+\w{1,5}.+?\">|<img src=\"http(s?):\/\/(?:\w+\.)+\w{1,5}.+?\"(?: style=\".+\")?>"))
=> nil

But testing it in Rubular says the string should be matched:

link

I can't understand why Rubular says the string should be matched, but in Ruby it is not.
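One thing worth checking in the snippet above is how the pattern reaches Regexp.new. Inside a double-quoted Ruby string, a backslash before a character that is not a recognised escape is simply dropped, so `\w` arrives as `w` and `\.` as `.`. Rubular takes the body of a regex literal, where those escapes survive. The sketch below isolates that difference; the variable names are made up for illustration, and this explanation is not part of the original thread.

```ruby
# In a double-quoted string, "\w" becomes "w" and "\." becomes ".",
# so the regex engine never sees the character classes.
from_double_quoted = Regexp.new("(?:\w+\.)+\w{1,5}")
from_literal       = /(?:\w+\.)+\w{1,5}/

p from_double_quoted.source  # prints "(?:w+.)+w{1,5}"
p from_literal.source        # prints "(?:\\w+\\.)+\\w{1,5}"

host = "filin.mail.ru"
p from_double_quoted.match(host)  # prints nil
p from_literal.match(host)[0]     # prints "filin.mail.ru"

# The <img> half of the pattern from the question, written as a literal.
html = '<img src="https://filin.mail.ru/pic?width=90&amp;height=90&amp;email=multicc%40multicc.mail.ru&amp;version=4&amp;build=7" style="">'
pattern = /<img src="http(s?):\/\/(?:\w+\.)+\w{1,5}.+?"(?: style=".+")?>/
p !pattern.match(html).nil?  # prints true
```

If the pattern has to be built from a string, a single-quoted string or doubled backslashes (`"\\w"`) keeps the escapes intact.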

Regex is the wrong tool for handling HTML (or XML) 99.9% of the time. Instead, use a parser, such as Nokogiri:

require 'nokogiri'

html = '<img src="https://filin.mail.ru/pic?width=90&amp;height=90&amp;email=multicc%40multicc.mail.ru&amp;version=4&amp;build=7" style="">'
doc = Nokogiri::HTML(html)

url = doc.at('img')['src'] # => "https://filin.mail.ru/pic?width=90&height=90&email=multicc%40multicc.mail.ru&version=4&build=7"
doc.at('img')['style'] # => ""

Once you've retrieved the data you want, such as the src, use the "right" tool for it, such as URI:

require 'uri'

scheme, userinfo, host, port, registry, path, opaque, query, fragment = URI.split(url)
scheme    # => "https"
userinfo  # => nil
host      # => "filin.mail.ru"
port      # => nil
registry  # => nil
path      # => "/pic"
opaque    # => nil
query     # => "width=90&height=90&email=multicc%40multicc.mail.ru&version=4&build=7"
fragment  # => nil

query_parts = Hash[URI.decode_www_form(query)]
query_parts # => {"width"=>"90", "height"=>"90", "email"=>"multicc@multicc.mail.ru", "version"=>"4", "build"=>"7"}
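As a variation on the same idea, URI.parse returns an object with named accessors, which can be easier to read than nine positional values. This is only a sketch using the URL from the example above; the `uri` and `params` names are chosen here for illustration.

```ruby
require 'uri'

url = "https://filin.mail.ru/pic?width=90&height=90&email=multicc%40multicc.mail.ru&version=4&build=7"

# URI.parse picks a scheme-specific class (URI::HTTPS for this URL).
uri = URI.parse(url)
p uri.scheme  # prints "https"
p uri.host    # prints "filin.mail.ru"
p uri.path    # prints "/pic"
p uri.port    # prints 443, the scheme default, since the URL has no explicit port

# decode_www_form returns [key, value] pairs; to_h turns them into a Hash.
params = URI.decode_www_form(uri.query).to_h
p params["email"]  # prints "multicc@multicc.mail.ru"
```

Note that `port` differs from the URI.split result: the parsed object fills in the default port for the scheme instead of reporting nil.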
