Requirements

Windows Vista/7/8/10
64 MB RAM
100 MB Hard Disk Space
Stable Internet Connection

What are people saying?

Student License

Recently I was looking for a URL extractor program because I am building a search engine, and I have tried nearly 20 of them, all with some missing features. But WDE seems to be the best one right now. I am just a student and the normal price is way too high for me. Do you have any student license to offer? I would really appreciate it.

Best Regards, Jeffrey Richardson, Sweden

 

Web Addresses

We downloaded and ran the trial version of your web link extractor. I compared it to another email extractor program and yours kicked its butt. Yours scanned 9000 files and found over 1500 links, while the other scanned only 1200 files and found only about 400 links. (This was using the exact same search file.)

Thanks, Mark Jeter

 

Email List Management

A perfect tool for email marketing mailing lists creation, processing and management.
ListMotor

Versions Web Data Extractor

v8.3 (Released 27.12.2011):

  • Added a "No emails found" message on the Emails page
  • Added removal of duplicates when loading URLs from a file
  • Fixed a few hangs/deadlocks
  • Added a filter combobox on the Emails page

v8.2 (Released 06.07.2011):

  • Updated search engines list
  • Added button "Translate" for "Search keywords:" area. This button translates typed keywords into other languages using Google Translate
  • Added button "Export of domains without any emails" in result tab "Emails"
  • Stability improvements
  • Various fixes

v8.1 (Released 27.12.2010):

  • Reworked multi-threading engine.
  • Improved URL-parser
  • Various fixes

v8.0 (Released 01.07.2010):

  • Improved search engines processing
  • Improved search of non-standard HTML formatted sites
  • Improved data autodetection (UTF-8, header, etc)
  • Added advanced network/connection processing (Human factor option, incorrect server responses, etc)
  • Windows 7 support was added
  • Various fixes

v7.3 (Released 17.06.2009):

  • Improved sites encoding processing
  • Site/Directory/Groups length limitation is now removed
  • Multi-threading engine is improved and now more stable
  • Some improvements for Email filter settings layout
  • Phone and fax scraper improved. The algorithm is more accurate now
  • Fixed some GUI tweaks
  • Improved crawling for sites with non-valid content or wrong server responses
  • Fixed "Stay with full url" and "Follow offsite links" options which failed for some sites before

v7.2 (Released 10.04.2009):

  • We are continuously improving our product. This version contains improved algorithms to find emails, phones and faxes, based on our users' experience and feedback
  • Extended the set of commonly used phone/fax prefixes
  • User defined email scraping prefixes are added
  • More powerful Intelligent spidering; it now supports all user defined prefixes
  • Improved search engines processing
  • Smart detection of the binary data
  • User interface tweaks and changes, including Tray icon and more

v7.1 (Released 15.04.2008):

  • Phone and fax numbers search algorithm has been improved
  • Added a function for creating a new session. It restores the standard settings after the search parameters are reset
  • When you define keywords for a search on a site or on search engines, the results table now includes a column with the keywords the search was based on. Each found link is matched to the keyword that appeared with it on the web page
  • Improved the search algorithm for links represented as table elements. A prefix may be located in one table cell and the number in another; the previous version could skip such a phone number. This defect is now fixed
  • Fixed the function that switches to external links
  • Vista support was added
  • The possibility to edit program settings without clearing search results was added
  • The set of the existing search engines was updated and enhanced, search engines for more countries were added, all non-existent search engines were deleted
  • The engine was revised and optimized
  • Small bugs were fixed

v6.0 (Released 17.04.2007):

  • New design
  • Database support
  • Faster parsing
  • Updated search engines
  • Many other improvements

v5.0 (Released 06.07.2006):

  • New interface
  • New parsing
  • Ability to reopen session with extracted data
  • Updated search engines
  • Many other improvements

v4.3 (Released 20.09.2005):

  • Powerful email filtering functions
  • Updated search engine listing
  • Extracting urls from search results only and other improvements

v4.0 (Released 11.12.2004):

  • Search Engine support of 30 countries
  • Page body extraction module
  • Keyword generator
  • Url generator
  • Intelligent spidering mode to extract more data without spidering entire site
  • Set 'meta tag size' limit option

 

Versions Web Data Extractor Professional

v3.10 (Released 06.01.2020):

  • Significantly improved parser of email addresses
  • User agents list has been updated
  • Added "Retry non-extracted URLs" and "Enhanced Human factor" options in Connection for even more effective work with target websites
  • Added options "Check each X minutes" and "Renew after it has read Y number of links" in «Proxy Servers» for more effective work with proxies
  • Many improvements have been made based on our customers' reviews!

v3.9 (Released 30.12.2018):

  • List of search engines is cleared of outdated/broken links. This allowed us to increase the speed of the software in «Search engines» mode
  • Significantly improved email addresses parser, especially for JS (JavaScript) hidden emails
  • Improved option to import own proxy servers from CSV files
  • Improved work with HTTPS websites
  • Improved performance when working with large URL lists
  • Improved "Cookie Capture" option
  • Various minor fixes/improvements based on customer feedback

v3.8 (Released 29.12.2017):

  • Added ability to load and extract information from PDF files
  • Added ability to load the license file directly from the UI form, when the trial period of using the program expires. Alternatively, the license file can be uploaded from the Options -> About form if the trial period has not yet expired
  • Significantly improved work through the proxy servers
  • Parser of encoded JS-emails has been improved
  • The context menu item "Re-start URL" was added to the "Bad URLs" list
  • Improved work with the software internal data repository
  • Added the ability to delete sessions along with all their data and service files. The software also automatically compresses its internal repository to reduce the required disk space
  • Added "Initial Referer" text field in UI. Some websites may display different information depending on which external site they come from. The "Initial Referer" field allows you to specify the web address of such a site
  • We also made various minor changes and improvements based on feedback from our customers

v3.7 (Released 28.02.2017):

  • Improved work of "Search Engines" mode
  • Improved "Remove HTML Tags" and "Page must contain the following text to extract data" filters
  • Added "Use country IP filter" filter, which excludes results from servers that are not related (by geolocation) to the country selected in the "Search Engines" option
  • Significantly improved email parser and «Custom Builder» parser
  • General improvements in data detection and extraction
  • We also made various minor changes and improvements based on feedback from our customers

v3.6 (Released 22.08.2016):

  • Added checkbox "Get redirected URL" on the "Custom Data Editor" form to extract urls (e.g. website addresses) that are presented through a redirect
  • Added checkbox "Mark Non-Responding Proxies Like Inactive Automatically". If a proxy server is determined to be «bad» (not working) during the session, it is automatically marked as inactive and is not used in the session
  • Added new option "Use single line merge" to merge data into a single string. For example, you can export t-shirt colors like: "T-Shirt", "Black, Yellow, Red, Green"
  • Significantly improved loading of public proxy servers from the Internet
  • "Human Factor" option has been improved
  • Improved the parser of email addresses hidden by JS
  • Improved option of passing Google-captcha when searching data via Google
  • We also made various minor changes and improvements based on feedback from our customers
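
The "Use single line merge" option above can be illustrated with a short sketch. This is not WDE's own code: the function names, the ", " separator, and the fully quoted CSV output are assumptions chosen to reproduce the t-shirt example from the release note.

```python
import csv
import io


def merge_single_line(name, values, separator=", "):
    """Collapse several extracted values into one field next to the item name.

    Hypothetical helper: the separator is an assumption based on the
    "Black, Yellow, Red, Green" example in the release note.
    """
    return [name, separator.join(values)]


def to_csv_row(row):
    """Render one row as CSV text with every field quoted."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerow(row)
    return buffer.getvalue().rstrip("\n")


row = merge_single_line("T-Shirt", ["Black", "Yellow", "Red", "Green"])
print(to_csv_row(row))
```

Joining the values before export keeps one record per product, which is usually easier to load into a spreadsheet than one row per color.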

v3.5 (Released 28.10.2015):

  • Significantly improved mechanism of searching data through search engines (added a mechanism to work with Google captcha etc.)
  • Added the ability to capture cookies (new button «Capture Cookie») and run a session with cookies (very useful when search form parameters are passed through cookies)
  • Added the ability to import proxy servers from a service that publishes fresh proxies every 30 minutes. Each import brings in about 100-140 proxies and replaces the previously downloaded list. During a session, a server that becomes completely inoperative is automatically marked inactive, so only working servers remain in the list
  • Added a new parser to decode email addresses hidden by JavaScript
  • Revised and improved server error handling, which has a positive impact on work through proxy servers
  • Improved email/fax address parser
  • Various minor improvements

v3.4 (Released 03.09.2015):

  • Improved parser of JavaScript-protected email addresses, added 2 new decoders
  • Improved algorithm for merging data for export
  • Added checkbox "Add in results" in filter "URL Filter: Page must contain the following text to extract data". If you turn it on, the results table will include the keywords of this filter that satisfied the search criteria when the data was retrieved
  • Improved parser of links, added a case that covers unquoted links in page sources
  • Improved the software's handling of large data sets
  • Improved export data mechanism
  • Improved filter mode "Url List" work
  • Added recognition of servers that do not support serving uncompressed content, and correct request formation for such servers
  • Added new search engine - IXQUICK. It does not save IP addresses and searches the main search engines. With this engine you can spider for days without being blocked
  • Fixed "Object null reference" issue
  • Various minor additions/fixes

v3.3 (Released 05.05.2015):

  • Improved parser of JavaScript-protected email addresses
  • Improved handling of network errors. Temporarily unavailable pages, for example due to high activity on the server, are now recognized better
  • Added use of regular expressions in filters. To recognize a regular expression, please enclose it between the symbols "^" and "$"
  • Added detection of specific symbols of the German language in urls
  • Added "Recovery" button in the settings. It allows you to export all the collected data for the selected date range, even if the main database of the program has been damaged for some reason
  • Added the ability to export data to Excel file format
  • Added the ability to save the results in multiple files if there are too many results. For example, you can specify that each file holds 10,000 lines (supported range of values: 1 - 1,000,000). The result is a main file "Results.(txt|csv|xlsx)" plus automatically generated files "Results_XXXX.(txt|csv|xlsx)" for each additional 10,000 lines
  • Greatly improved algorithm for traversing large sites containing millions and tens of millions links
  • Various fixes/additions based on your feedback
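
The multi-file export described above can be sketched roughly as follows. This is only an illustration of the idea, not WDE's implementation: the function name is hypothetical, and reading "XXXX" as a zero-padded four-digit counter is an assumption, since the release note does not say what the placeholder stands for.

```python
from itertools import islice


def split_results(rows, lines_per_file=10_000, base="Results", ext="txt"):
    """Yield (file_name, chunk) pairs, each chunk holding at most lines_per_file rows.

    The first chunk goes to the main file; later chunks go to numbered files.
    The "_0001" style numbering is an assumption for illustration.
    """
    if not 1 <= lines_per_file <= 1_000_000:
        raise ValueError("lines_per_file must be between 1 and 1,000,000")
    iterator = iter(rows)
    index = 0
    while True:
        chunk = list(islice(iterator, lines_per_file))
        if not chunk:
            break
        if index == 0:
            name = f"{base}.{ext}"
        else:
            name = f"{base}_{index:04d}.{ext}"
        yield name, chunk
        index += 1


# 25 rows split into files of 10 lines each
for file_name, chunk in split_results(range(25), lines_per_file=10):
    print(file_name, len(chunk))
```

Working from an iterator means the full result set never has to be held in memory at once, which matters when a session produces millions of rows.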

v3.2 (Released 30.12.2014):

  • Added an option “Remove duplicates” for phone and faxes
  • The principle of extracting links and domains was changed for “Search Engines” mode (when the check-boxes “URLs” and “Domains” are marked in the session's form). These lists now include only URLs of websites, and their domains, that contain the sought-for keywords. Previously the list consisted of all found URLs, which made it of little use in “Search Engines” mode
  • Increased the maximum depth of search on websites from 10 to 100
  • Improved parsing of emails protected by JavaScript and added a decryption algorithm for a new kind of email protection
  • Added the ability to search in local website copies on disk, for example "c:\inetpub\wwwroot\spadix". Such a path can be set as “Start URL” in “Site” mode as well as in the links file for “Url List” mode
  • Increased the stability of the program. Now, in case of an abrupt computer restart or system crash, all data collected during a session will be saved. Auto-saving runs every 30 seconds
  • The program now requires .Net Framework 4.0 (Client Profile or full profile)
  • SQLite library is upgraded to the latest version
  • Fixed crashes in some cases when building in “Custom Data Editor”

v3.1 (Released 05.09.2014):

  • Added the ability to edit URL and email filters in a stopped session and then continue with the edited filters
  • Added the ability to load the list of proxy servers from text files (*.txt). Also added support for files in “host:port” format
  • Added progress in percent for requests. The list of requests now updates very quickly
  • Added the name of the proxy through which a request is sent to the “Title” field (for running requests only)
  • Improved the distribution of requests across proxies; it now works much more efficiently with a big list of proxies
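
Reading a proxy list in “host:port” format, as mentioned above, might look like the following minimal sketch. The function name and the choice to skip malformed lines silently are assumptions; the release note does not describe how WDE treats invalid entries.

```python
def parse_proxy_lines(lines):
    """Parse lines of the form "host:port" into (host, port) tuples.

    Blank lines, lines without a colon, and lines with an invalid port
    are skipped (an assumption made for this illustration).
    """
    proxies = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        # Split on the last colon so the host part stays intact
        host, separator, port = line.rpartition(":")
        if not separator or not host:
            continue
        if not (port.isascii() and port.isdigit()):
            continue
        port_number = int(port)
        if 0 < port_number <= 65535:
            proxies.append((host, port_number))
    return proxies


sample = ["10.0.0.1:8080", "", "bad line", "proxy.example.com:3128\n", "host:99999"]
print(parse_proxy_lines(sample))
```

To use it with a *.txt file, open the file and pass the file object directly, since iterating over it yields one line at a time.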

v3.0 (Released 23.06.2014):

  • Added support of working with proxy servers' list
  • Various small fixes/additions based on your feedback

v2.3 (Released 08.01.2014):

  • Added the ability to retrieve data with preservation of custom HTML markup (checkbox "Remove HTML tags")
  • Improved extraction of meta tags
  • Fixed errors when exporting to csv
  • Various small fixes/additions based on your feedback

v2.2 (Released 15.05.2013):

  • Bug fixes and improvements based on customer requests
  • "Remove duplicates" option added to "Email Filter"
  • New regular expressions builder for custom data search. Now you can choose one or two similar text blocks. It works even with one block chosen. It can be useful when you need to extract company details, for example address, phone numbers etc and the webpage has information only for one company.

v2.1 (Released 27.12.2012):

  • Significantly enhanced phones and faxes parser
  • Added additional filters for phones and faxes. Now you can specify which digits a phone number should contain as well as the maximum length of a phone/fax number
  • Now you can save/load session settings to/from a file
  • Added command line arguments support: you can start a session from saved settings and specify the file with input records
  • Added the ability to make an “advanced” merge of custom data results; now you can gather structured data from online shops and other places

v2.0 (Released 29.08.2012):

  • Visual expression builder - it was never that easy to configure your custom extraction expressions
  • New updated help with a lot of use cases and examples
  • Merging results when saving to a file or copying to the clipboard
  • Great results filtering option
  • New feature - "Collect Domains Without Emails"
  • Many visual and engine changes/fixes

v1.2 (Released 07.06.2012):

  • Ability to scan RSS feeds is added
  • Added resilience to physical damage of the database
  • Improved stream control, which has a positive impact on the overall performance
  • Added the ability to detect email addresses written as info[at]mail.com and info(at)mail.com
  • Added decoding of email addresses hidden using JavaScript
  • Scan time is now displayed without fractions of a second and includes a day count
  • Improved work with a large list of keywords in "Search Engines" mode
  • Added quotes support in keywords to search for exact phrases and words in Google
  • Reworked the algorithm for determining the depth of scan (Url Depth)
  • Improved filter to screen out potentially incorrect phones and faxes
  • Added ability to set the "Fixed Number Pages" in "Search Engines" mode
  • Added ability to define tag "<base href .."
  • SQLite engine updated
  • Search for new countries added: Arabia, Argentina, Chile, Philippines, Singapore. Also checked and corrected the existing list of search queries
  • Various fixes
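
Recognizing addresses written as info[at]mail.com or info(at)mail.com, as listed above, can be approximated with two regular expressions. This is a simplified sketch, not the parser WDE uses: the function name is hypothetical, and the email pattern is deliberately loose rather than a full address grammar.

```python
import re

# Matches "[at]" or "(at)", with optional surrounding spaces, in any letter case
AT_PLACEHOLDER = re.compile(r"\s*(?:\[at\]|\(at\))\s*", re.IGNORECASE)

# Deliberately simple email pattern for illustration only
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def find_emails(text):
    """Replace [at]/(at) placeholders with "@", then collect email-like strings."""
    normalized = AT_PLACEHOLDER.sub("@", text)
    return EMAIL.findall(normalized)


text = "Write to info[at]mail.com or info(at)mail.com, or sales@example.org."
print(find_emails(text))
```

Normalizing the text first keeps the extraction pattern itself simple; a stricter variant could require the placeholder to sit between a plausible local part and domain to reduce false positives.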

v1.1 (Released 04.04.2012):

  • Significantly improved recognition of phones and faxes
  • Updated list of search engines. Added specific Arabic, Chinese and other search engines
  • Added extended copy of results to the clipboard
  • Design completed: aligned positions / sizes of controls, buttons now have a new style
  • Various fixes

v1.0 (Released 12.03.2012):

  • Completely new powerful spidering engine
  • Completely reworked UI - slick & sexy
  • Pro version of WDE doesn't have any limits - feel free to process thousands of sites, gigabytes of data
  • Extremely fast search and accuracy
  • Extract any data you want by Custom data extraction
  • Robust .Net engine keeps searching for days without interruptions
  • New session management lets you manage huge amounts of data
  • Brand new simplified user interface