This is a simple extractor that retrieves Wikipedia articles belonging to a given category.
gem install Wiki_Category_Extractor
- nokogiri
- open-uri
This gem provides an interface for extracting Wikipedia articles. The current version extracts the URLs of the articles in the given seed category.
- Category is a string containing the name of the seed category.
- Pagelimit is the number of pages to extract.
Wiki_Category_Extractor.extract('Algorithms', 50)
Other categories to try include Medicine and Sports.
=> {"learn more"=>"", "Algorithm"=>"", "Template:Algorithm-begin"=>"", "Template:Algorithm-end"=>"", "List of algorithm general topics"=>"", "List of algorithms"=>"", "Adaptive algorithm"=>"", "Adaptive projected subgradient method"=>"", "Algorism"=>"", "Algorithm characterizations"=>"", "Algorithm design"=>"", "Algorithm examples"=>"", "Algorithmics"=>"", "The Art of Computer Programming"=>"", "Biologically inspired algorithms"=>"", "Bisection (software engineering)"=>"", "British Museum algorithm"=>"", "Chandy-Misra-Haas Algorithm:Resource Model"=>"", "Decrease and conquer"=>"", "Devex algorithm"=>"", "Divide and conquer algorithm"=>"", "Driver scheduling problem"=>"", "Generalized distributive law"=>"", "HAKMEM"=>"", "HCS clustering algorithm"=>"", "Holographic algorithm"=>"", "Hybrid algorithm"=>"", "Hyphenation algorithm"=>"", "In-place algorithm"=>"", "Kinodynamic planning"=>"", "Manhattan address algorithm"=>"", "Matching engine"=>"", "Maze generation algorithm"=>"", "Maze solving algorithm"=>"", "Medical algorithm"=>"", "One-pass algorithm"=>"", "Out-of-core algorithm"=>"", "Ping-pong scheme"=>"", "Pointer jumping"=>"", "Predictor–corrector method"=>"", "Randomization function"=>"", "Randomized rounding"=>"", "Rendezvous hashing"=>"", "Reservoir sampling"=>"", "Run to completion scheduling"=>"", "Run-time algorithm specialisation"=>"", "Sardinas–Patterson algorithm"=>"", "Sequential algorithm"=>"", "Shuffling algorithm"=>"", "Sieve of Eratosthenes"=>"", "Simulation algorithms for atomic DEVS"=>"", "Simulation algorithms for coupled DEVS"=>"", "Spreading activation"=>"", "Streaming algorithm"=>"", "Super-recursive algorithm"=>"", "Temporal anti-aliasing"=>"", "Timeline of algorithms"=>"", "Tomasulo algorithm"=>"", "Hindley–Milner type system"=>"", "XOR swap algorithm"=>""}
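The hash printed above maps each title to an empty string, so in practice only the keys carry information. As a rough sketch of how a caller might post-process such a result, the snippet below uses a small sample copied from that output, drops entries that do not look like articles, and builds a URL for each remaining title. The helper `wikipedia_url` is hypothetical and not part of the gem. Treating "learn more" and "Template:" entries as non-articles is an assumption, as is building English Wikipedia URLs from titles.

```ruby
require "uri"

# Sample shaped like the result shown above (a few entries copied from it).
result = {
  "learn more" => "",
  "Algorithm" => "",
  "Template:Algorithm-begin" => "",
  "List of algorithms" => "",
  "Sieve of Eratosthenes" => ""
}

# Hypothetical helper, not provided by the gem: build an English Wikipedia
# URL from an article title. Spaces become underscores, and any remaining
# characters outside the unreserved set are percent-encoded.
def wikipedia_url(title)
  "https://en.wikipedia.org/wiki/" + URI.encode_www_form_component(title.tr(" ", "_"))
end

# Assumption: "learn more" and "Template:" entries are not articles.
articles = result.keys.reject do |title|
  title == "learn more" || title.start_with?("Template:")
end

urls = articles.map { |title| wikipedia_url(title) }
puts urls
```

To use this with real data, replace the sample hash with the value returned by `Wiki_Category_Extractor.extract`.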