bash - Getting the first paragraph of Wikipedia, and storing it into a text file


I want to make a system in which I type a search term into the terminal of a Raspberry Pi, and the Pi gives voice output.

I've solved the text-to-speech conversion problem using Pico TTS. Now I want to go to the Wikipedia page of the term searched for, and store the first paragraph of that page in a text file.

For example, the result for the input tiger in Simple English should be a text file containing:

The tiger (Panthera tigris) is a carnivorous mammal. It is the largest living member of the cat family, the Felidae. It lives in Asia, mainly India, Bhutan, China and Siberia.

I tried using this, but it didn't seem to work.

Error message:

$ pip install wikipedia
...
Command /usr/bin/python -c "import setuptools, tokenize;__file__='/tmp/pip-build-qdtizy/wikipedia/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /tmp/pip-9cpd6d-record/install-record.txt --single-version-externally-managed --compile failed with error code 1 in /tmp/pip-build-qdtizy/wikipedia
Storing debug log for failure in /home/pi/.pip/pip.log

This seems to work:

title=tiger
n_sentences=2
curl -s "http://simple.wikipedia.org/w/api.php?action=query&prop=extracts&titles=$title&exsentences=$n_sentences&explaintext=&format=json" |
  sed 's/.*"extract":"\|"}}}}$//g'

It correctly yields:

The tiger (Panthera tigris) is a carnivorous mammal. It is the largest living member of the cat family, the Felidae.

Also tested with title=Albert_Einstein:

Albert Einstein (14 March 1879 \u2013 18 April 1955) was a German-born theoretical physicist who developed the general theory of relativity, one of the two pillars of modern physics (alongside quantum mechanics).\nHe received the Nobel Prize in Physics in 1921, but not for relativity.
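
The \u2013 and \n in that output are JSON escape sequences, which the sed expression passes through untouched. Below is a minimal sketch of a clean-up step, assuming GNU sed (which accepts \n in the replacement text). The function name unescape_json is made up for illustration, and \uXXXX escapes are deliberately left alone:

```shell
# Hypothetical helper: turn the JSON escapes \n and \" back into
# a real newline and a plain double quote. Assumes GNU sed.
# \uXXXX escapes (such as \u2013) are not handled by this sketch.
unescape_json() {
  sed -e 's/\\n/\n/g' -e 's/\\"/"/g'
}

# Example on a hand-written string, no network needed:
printf '%s\n' 'First sentence.\nSecond \"quoted\" sentence.' | unescape_json
```

Piping the output of the sed command through this function would put each paragraph on its own line before the text is handed to the speech step.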

(Note that title="Albert Einstein", title=albert_einstein, and title=Albert%20Einstein don't work, so you'll want a command to find the best matching real simple.wikipedia article title.)
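
One possible way to find a matching title is the MediaWiki opensearch action, which returns a JSON array whose second element is a list of suggested article titles. The sketch below is only an illustration under that assumption: the function names first_title and best_title are invented here, https is used instead of http, and the sed-based parsing assumes the search term itself contains no double quotes.

```shell
# Hypothetical helper: pull the first suggested title out of an
# opensearch response of the form ["query",["Title 1","Title 2"],...].
# Prints nothing when the list of titles is empty.
first_title() {
  sed -n 's/^\["[^"]*",\["\([^"]*\)".*/\1/p'
}

# Hypothetical helper: ask simple.wikipedia.org for the best match.
# -G sends the -d/--data-urlencode pairs as a query string, so spaces
# in the search term are encoded by curl. https is an assumption here.
best_title() {
  curl -s -G 'https://simple.wikipedia.org/w/api.php' \
    -d action=opensearch -d limit=1 -d format=json \
    --data-urlencode "search=$1" | first_title
}

# Usage (needs network access):
#   title=$(best_title "albert einstein")
```

The returned title may contain spaces, so it would need to be URL-encoded again (for example with --data-urlencode) before being used in the titles= parameter.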

The curl command makes an HTTP request to simple.wikipedia.org. To see it in action, try this:

curl "http://simple.wikipedia.org/w/api.php?action=query&prop=extracts&titles=tiger&exsentences=2&explaintext=&format=json"

The sed command extracts the desired part of the response.

Updated to increase the chance of working with the Raspberry Pi's curl and sed: changed https to http, and rewrote the sed command without -e.
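
Since the goal is a text file, the pieces can be wrapped into a small function that writes the extract to disk. This is a sketch rather than a tested Raspberry Pi setup: the function names fetch_extract_json and save_intro are made up, and it uses https with curl's -G and --data-urlencode options (not part of the command above) so that curl encodes the title.

```shell
# Hypothetical helper: fetch the raw JSON for one article.
# $1 is the article title, $2 the number of sentences.
# https and -G/--data-urlencode are assumptions of this sketch.
fetch_extract_json() {
  curl -s -G 'https://simple.wikipedia.org/w/api.php' \
    -d action=query -d prop=extracts -d explaintext= -d format=json \
    -d "exsentences=$2" --data-urlencode "titles=$1"
}

# Hypothetical helper: write the extract for title $1 to file $2,
# using the same sed expression as above. $3 is the number of
# sentences (default 2).
save_intro() {
  fetch_extract_json "$1" "${3:-2}" |
    sed 's/.*"extract":"\|"}}}}$//g' > "$2"
}

# Usage (needs network access):
#   save_intro Tiger tiger.txt
```

Keeping the network call in its own function makes it easy to swap in a canned response when trying out the sed step offline.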

Ref:

MediaWiki API

