Web Scraping with Perl scripting

As with many other tasks, Perl offers a range of modules for web scraping, that is, extracting specific information from HTML pages. Below is a sample script that I wrote to extract movie information from pages on the www.justtollywood.com website.

[code language="perl"]
#! perl
#===============================================================================
# Objective:
# ----------
#
# Perl script to demo the web scraping modules to extract intended information
# from web pages.
#
# For this example, I used www.justtollywood.com pages.
#
# $Header: $
#===============================================================================
# Include Modules
#===============================================================================
use strict;
use warnings;
use Pod::Usage;
use File::Basename;
use HTML::TableExtract;
use HTML::TreeBuilder 3;
use Getopt::Long […]


VBScript to Disable/Hide Task Bar in Windows XP

[code language="vb"]
'xp_taskbar_desktop_fixall.vbs - Repairs the Taskbar when minimized programs don't show.
Set WSHShell = WScript.CreateObject("WScript.Shell")
Message = "To work correctly, the script will close" & vbCR
Message = Message & "and restart the Windows Explorer shell." & vbCR
Message = Message & "This will not harm your system." & vbCR & vbCR
Message = Message & "Continue?"
X = MsgBox(Message, vbYesNo, "Notice")
If X = 6 Then
    On Error Resume Next
    WshShell.RegDelete "HKCU\Software\Microsoft\Windows\CurrentVersion\Explorer\StuckRects2\"
    WshShell.RegDelete "HKCU\Software\Microsoft\Windows\CurrentVersion\Explorer\StreamMRU\"
    WshShell.RegDelete "HKCU\Software\Microsoft\Windows\CurrentVersion\Explorer\Streams\Desktop\"
    WshShell.RegDelete "HKCU\Software\Microsoft\Internet Explorer\Explorer Bars\{32683183-48a0-441b-a342-7c2a440a9478}\BarSize"
    P1 = "HKCU\Software\Microsoft\Windows\CurrentVersion\Policies\Explorer\"
    WshShell.RegWrite p1 & "NoBandCustomize", 0, "REG_DWORD"
    WshShell.RegWrite […]


Perl script: HtmlAsText.pl to convert HTML content into Text File format

This program illustrates how tools like HtmlAsText typically work.

[code language="perl"]
#! perl
#===============================================================================
# Objective:
# ----------
#
# Script to convert HTML File content into Text File
#
#
# $Header: $
#===============================================================================
# Include Modules
#===============================================================================
use strict;
use warnings;
use Pod::Usage;
use LWP::Simple;
use File::Basename;
use HTML::Table;
use Alvis::HTML;
use HTML::TableExtract qw(tree);
use Getopt::Long qw(:config no_ignore_case bundling);
#===============================================================================
# Global Variables Declaration
#===============================================================================
use vars qw($DEBUG $SRC_FLDR $DEST_FLDR $HTML_TBL_OBJ $ALVIS_HTML_OBJ);
#===============================================================================
# Prototypes Section
#===============================================================================
sub DoAction;
sub My_Readdir;
sub InitGlobals;
sub ProcessArgs;
sub Info {my ($mesg) = @_; print STDOUT "INFO: $mesg\n";}
sub […]
