Welcome to my World



Life Hacks: Manga Downloader

Posted on 2018-09-27

You love manga. You are searching for a way to read it offline. Congratulations, you just found one.

We import the following packages:

  1. requests - gives us some of the functionality of a browser
  2. os - lets us organize files and folders
  3. bs4 - helps us find the required HTML tags

Steps:

  1. We make a folder using os
  2. We use requests to download the page
  3. We use bs4 to find the URL of the image file
  4. We download the image using requests and write it to disk
  5. We use bs4 to find the previous image link and change the URL accordingly
os.makedirs('xkcd', exist_ok=True)

We make a new folder. If it already exists, no new folder is made.
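That behavior can be checked with a quick sketch; the throwaway temporary parent directory is only there so the demo does not touch the real working directory:

```python
import os
import tempfile

base = tempfile.mkdtemp()               # throwaway parent folder for the demo
target = os.path.join(base, 'xkcd')

os.makedirs(target, exist_ok=True)      # creates the folder
os.makedirs(target, exist_ok=True)      # already exists: no error, no new folder

print(os.path.isdir(target))            # True
```

Without exist_ok=True, the second call would raise FileExistsError.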

res = requests.get(url)
res.raise_for_status()

We use requests to download the page. The second line raises an exception, stopping the program, if the download failed.

soup = bs4.BeautifulSoup(res.text, "html.parser")
comicElem = soup.select('#comic img')
if comicElem == []:
    print("Couldn't find comic image")
else:
    comicURL = 'http:' + comicElem[0].get('src')

We find the image URL.

imageFile = open(os.path.join('xkcd', comicElem[0].get('alt')), 'wb')
for chunk in res.iter_content(100000):
    imageFile.write(chunk)
imageFile.close()

We save the image under the folder we created, using os.path.join to build the file path, and write it out in chunks.
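The same chunked-write pattern can be sketched without the network, with an in-memory stream standing in for the HTTP response (the file name and byte contents here are made up for the demo; the real code reads chunks via res.iter_content):

```python
import io
import os
import tempfile

data = b'\x89PNG' + b'\x00' * 249996      # stand-in for downloaded image bytes
stream = io.BytesIO(data)                 # stands in for the HTTP response body

folder = tempfile.mkdtemp()
path = os.path.join(folder, 'comic.png')  # os.path.join builds a platform-correct path

# 'with' closes the file even if a write fails, unlike a bare open()/close()
with open(path, 'wb') as imageFile:
    for chunk in iter(lambda: stream.read(100000), b''):
        imageFile.write(chunk)            # write at most 100000 bytes at a time

print(os.path.getsize(path))              # 250000
```

Writing in fixed-size chunks keeps memory use bounded even for large images.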

The entire program, for reference:

import requests, os, bs4

url = 'http://xkcd.com'
os.makedirs('xkcd', exist_ok=True)
while not url.endswith('#'):
    print('Downloading the page %s...' % (url))
    res = requests.get(url)
    res.raise_for_status()

    soup = bs4.BeautifulSoup(res.text, "html.parser")
    comicElem = soup.select('#comic img')
    if comicElem == []:
        print("Couldn't find comic image")
    else:
        comicURL = 'http:' + comicElem[0].get('src')
        print('Downloading %s image %s...' % (comicElem[0].get('alt'), comicURL))
        res = requests.get(comicURL)
        res.raise_for_status()
        imageFile = open(os.path.join('xkcd', comicElem[0].get('alt')), 'wb')
        for chunk in res.iter_content(100000):
            imageFile.write(chunk)
        imageFile.close()

    # The first comic's "prev" link is '#', which ends the loop
    prevLink = soup.select('a[rel="prev"]')[0]
    url = 'http://xkcd.com' + prevLink.get('href')

print('Download complete')

Life Hacks: Phone/Email Extractor

Posted on 2018-09-27

You are working for a company, and you are asked to find the phone numbers and email addresses in a long Excel file. You start scrolling through the document, which goes on endlessly. You know you will have to do lots of overtime to finish the work. You look at your phone and see your son’s photo; you won’t be spending family time because of the workload. But suddenly, you have an idea.

Python makes this task really easy using regular expressions.
Note: Reproducibility was the most important thing kept in mind while writing the program.

We import the following packages:

  1. re - the regular expression package
  2. pyperclip - lets us work with the clipboard

Steps:

  1. Write the regular expression patterns
  2. Paste the text using pyperclip
  3. Print the matched results

Regular expressions are very useful, but they are not without their downsides. It is important to get the pattern syntax right, or the whole thing will go wrong.
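One concrete way it goes wrong, as a sketch: both patterns below rely on the re.X (verbose) flag. Without it, the whitespace and # comments become literal parts of the pattern and nothing matches:

```python
import re

pattern = r'''(\d{3})-(\d{4})  # a seven-digit phone number'''

strict = re.compile(pattern)         # spaces and the comment are matched literally
verbose = re.compile(pattern, re.X)  # re.X ignores whitespace and # comments

print(strict.findall('555-1234'))    # []
print(verbose.findall('555-1234'))   # [('555', '1234')]
```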

phoneRegex = re.compile(r'''(
(\d{3}|\(\d{3}\))   # area code
(\s|-|\.)           # separator
(\d{4}|\d{3})       # middle 3 or 4 digits
(\s|-|\.)           # separator
(\d{4})             # last 4 digits
)''', re.X)

emailRegex = re.compile(r'''(
[a-zA-Z0-9.+_%-]+   # username
@                   # @ symbol
[a-zA-Z0-9.-]+      # domain name
(\.[a-zA-Z]{2,4})   # dot-something (most likely .com)
)''', re.X)

The next step can be modified in several ways. The most general way is to copy the text and use pyperclip to read it from the clipboard.

The final step is to print the results in the desired format.
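Putting the three steps together, a sketch with hard-coded sample text standing in for the clipboard (swap the string for pyperclip.paste() in the real program; the sample number and address are made up):

```python
import re

phoneRegex = re.compile(r'''(
(\d{3}|\(\d{3}\))   # area code
(\s|-|\.)           # separator
(\d{4}|\d{3})       # middle 3 or 4 digits
(\s|-|\.)           # separator
(\d{4})             # last 4 digits
)''', re.X)

emailRegex = re.compile(r'''(
[a-zA-Z0-9.+_%-]+   # username
@                   # @ symbol
[a-zA-Z0-9.-]+      # domain name
(\.[a-zA-Z]{2,4})   # dot-something (most likely .com)
)''', re.X)

text = "Call 415-555-1234 or write to support@example.com for help."

# group 0 is the whole match; 1, 3, 5 are area code, middle digits, last digits
phones = ['-'.join([g[1], g[3], g[5]]) for g in phoneRegex.findall(text)]
emails = [g[0] for g in emailRegex.findall(text)]

print(phones)   # ['415-555-1234']
print(emails)   # ['support@example.com']
```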

Thoughts: I definitely think this will be useful to me. I can modify it to build my own web crawler. I can think of many scenarios where it would come in handy.

The whole program is written below for reference.

## Phone number and email extractor

import pyperclip, re

# pyperclip is used to read the contents of the clipboard
# print(pyperclip.paste())

phoneRegex = re.compile(r'''(
(\d{3}|\(\d{3}\))   # area code
(\s|-|\.)           # separator
(\d{4}|\d{3})       # middle 3 or 4 digits
(\s|-|\.)           # separator
(\d{4})             # last 4 digits
)''', re.X)

# print(phoneRegex.findall("easfgd 435-324-4323 anf 535-2452-3424 "))

emailRegex = re.compile(r'''(
[a-zA-Z0-9.+_%-]+   # username
@                   # @ symbol
[a-zA-Z0-9.-]+      # domain name
(\.[a-zA-Z]{2,4})   # dot-something (most likely .com)
)''', re.X)

# print(emailRegex.findall("email is fdsdgr@gmai.com"))

text = str(pyperclip.paste())
matches = []
for groups in phoneRegex.findall(text):
    phoneNum = '-'.join([groups[1], groups[3], groups[5]])
    matches.append('Phone numbers')
    matches.append(phoneNum)

for groups in emailRegex.findall(text):
    matches.append("Emails")
    matches.append(groups[0])

for match in matches:
    print(match)

Life Hacks: The End

Posted on 2018-09-16

There are a lot of tasks that can be automated with programming. Work that usually takes hours, if not days, can be done in minutes. Life Hacks explores some of those tasks and tries to automate them.

Every program is written so that it can easily be modified for a wide range of applications. This is extremely important, because writing a new program for each different task (which is usually the case) takes a long time.

Please contact me by email if there are any bugs.

Blog 0: The Start

Posted on 2018-09-16

It has begun!!

Machines have begun their invasion on mankind. It was foolish for us to ever think that we could control them, for we are just Humans.

In the 1940s, the first atom bomb was tested. A scientist was heard saying, “We have now obtained the power of Gods.” We were all proud and terrified at the same time: proud that we were becoming stronger, terrified because we were not wise enough to be that strong. We deviated from the path set out for us by the Gods. We were delighted, as we were now able to choose a path by our own free will. But that was not the only feeling. As we looked, all the paths were far more dangerous than the current one.

Hello World

Posted on 2018-09-16

Useful shell commands are written below.

Quick Start

Create a new post

$ hexo new "My New Post"
$ hexo new page "About Me"

More info: Writing

Run server

$ hexo server

More info: Server

Generate static files

$ hexo generate

More info: Generating

Deploy to GitHub

$ hexo d -g

More info: Deployment


© 2019 Sahil Bansal