Come on Python! Just Read My Clipboard

I don’t know whether any of you have been annoyed by MyStatLab, powered by Pearson. Lately I have been busy doing homework on it. The homework is statistical analysis and requires a lot of work in RStudio. However, as everybody knows, R doesn’t support the xls format very well, so users have to convert the xls file to a csv file first.

That might not seem like a big deal as long as you have the csv file, a format that is ubiquitous in the data world. However, MyStatLab doesn’t provide csv and gives users xls instead! It offers two options: one is to copy the data sheet to the clipboard; the other is to download it as an Excel file.

This was a total nightmare when I tried to do my work on MyStatLab. Since I wanted to avoid the uncertainties of reading a txt file in RStudio (which would mean copying and pasting the data into a txt file, and you can’t simply rename “.txt” to “.csv” because it won’t work!), I had to download the Excel file, open it in Excel, save it as a csv file, and then read it in RStudio. All of this is repetitive, time-consuming, and unpleasant.

After finishing my work painfully and stupidly, I calmed down and tried to solve the problem in Python. I hate repetitive work and I believe a machine should do it better. So I wrote a small program that reads my clipboard and saves it as a csv file.

P.S. I did consider reading the Excel file and having a Python program save it as a csv file, but I didn’t see any gain in efficiency compared with copying the data and running the Python script. Another reason is that writing csv files is supported natively by Python.

Here is the code:

I referred to an existing solution, which was quite useful, and stopped the Tkinter window from popping up automatically. I also added support for UTF-8 encoding, because I might use the program to convert csv files whose headers contain Chinese characters.
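The approach described above can be sketched roughly as follows. This is a minimal Python 3 sketch, not the original script: the function names and the output file name `clipboard.csv` are made up for illustration, and it assumes the copied data sheet arrives on the clipboard as tab-separated text, which is how spreadsheets usually copy cells.

```python
import csv


def read_clipboard():
    """Return the clipboard contents as text without showing a Tk window."""
    # Imported here so the helpers below also work on machines without a display.
    import tkinter  # the module is called "Tkinter" on Python 2

    root = tkinter.Tk()
    root.withdraw()  # hide the empty window that Tk would otherwise pop up
    try:
        return root.clipboard_get()
    finally:
        root.destroy()


def rows_from_clipboard_text(text):
    """Split clipboard text into rows, assuming tab-separated columns."""
    return [line.split("\t") for line in text.splitlines() if line.strip()]


def write_csv(rows, path):
    """Write the rows as csv, encoded as UTF-8 so Chinese headers survive."""
    # newline="" lets the csv module control the line endings itself.
    with open(path, "w", encoding="utf-8", newline="") as handle:
        csv.writer(handle).writerows(rows)


if __name__ == "__main__":
    write_csv(rows_from_clipboard_text(read_clipboard()), "clipboard.csv")
```

Copy the data sheet, run the script, and the csv file should appear in the working directory, ready for `read.csv` in RStudio.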

Something I like about SAS

I’ve been actively learning SAS for about a week. During this time, I found some things about SAS that have really kept it at the top of statistical software for a long time. I want to write down my impressions here, as a reminder.

First of all, I want to ask a question: what software do you think of when you hear other people talk about statistics? Answers may vary: R, Python, SAS, STATA, SPSS... To be honest, there was one of them I had never touched, and that is SAS. Because of my undergraduate major, I used STATA quite a lot, and my whole paper depended on it. I have also tried to broaden my view of statistical software, so I have installed and played with RStudio and SPSS before. Apart from the various capabilities of R, I was not stunned by any statistical software. At least that was my initial impression. I love Python very much, but I don’t have much experience with its statistics-related packages such as NumPy.

Only when I was forced to learn SAS did I discover its most interesting feature: it separates data input from data analysis with the DATA and PROC commands! I appreciate this design very much, because as the command file grows larger, it helps you find the part you want to navigate to more efficiently. There is a learning curve at first, but once you get the hang of it, you will treasure the clarity it gives you.

When I wrote my undergraduate paper using STATA, I thought of it as Python-like statistical software because the two have similar coding logic. It is simple, but still not easy for a non-programmer to understand at once. As for R, it simplifies a lot of the coding and makes it easier for statisticians to code, but it is much harder to learn than SAS because it is more versatile and contains many packages that a researcher does not need. People have different tastes, which is why there are many similar but different tools in one area.

Frankly speaking, the coding structure of SAS is not as efficient as that of R. For example, I have to type “RUN;” each time to make it run. What I like is the design of its logic, which separates data input from analysis. Just think about it: if Python could open every type of data file with a “DATA” command, it would be a great convenience!

Generating an unbreakable password?

The other day, while I was busy learning SAS and R in the workshop provided by the MSBA program at USC, I thought a lot about my next Python program. All the programs I write are driven by my interest in Python, so until I find something interesting again, I will hardly write anything notable.

I came across a post that inspired this short program. Unfortunately it was written in Chinese; its main idea is the security of your password. To sum up, there are a lot of schemes for protecting stored passwords, and most of them are related to MD5 (a message-digest algorithm), which generates a hash value. However, as technology grows fast, MD5 hashes can also be cracked using supercomputers. Many people recommend a newer algorithm, but in my opinion cracking it is only a matter of time.

So what I am going to do is make password cracking harder. I am going to shuffle the password randomly and add a for loop to increase the complexity and the time needed to break it. Here is my simple idea:
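One way to read the shuffle-plus-loop idea is sketched below. It is an illustration under assumptions, not the original program: the function names, the seed parameter, the round count, and the choice of SHA-256 are all mine. The shuffle is seeded so that the same password and seed always give the same result, which means the seed has to be stored next to the hash.

```python
import hashlib
import random


def shuffle_password(password, seed):
    """Return the characters of password in an order determined by seed."""
    chars = list(password)
    # random.Random is not a cryptographic generator; it is used here only
    # to get a reproducible shuffle for the same seed.
    random.Random(seed).shuffle(chars)
    return "".join(chars)


def stretch(password, seed, rounds=10000):
    """Shuffle the password, then hash it repeatedly so each guess costs more."""
    digest = shuffle_password(password, seed).encode("utf-8")
    for _ in range(rounds):
        digest = hashlib.sha256(digest).digest()
    return digest.hex()


if __name__ == "__main__":
    print(stretch("correct horse", seed=2015))
```

Raising `rounds` makes every guess proportionally slower for an attacker, at the same cost to the legitimate check. The standard library's `hashlib.pbkdf2_hmac` offers a vetted version of the same repeated-hashing idea with a salt.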

While working on this example I found that Sublime Text 3, surprisingly, does not support the “raw_input” function, because it cannot pop up a window for users to type strings into. That makes ST3 annoying, since I have to keep a terminal open all the time.

Overall, I don’t think any password is unbreakable, so security should be considered carefully whenever you store one somewhere. What’s more, we need to increase the time and cost of cracking, and also lower the value of breaking our passwords. Protect yourself on the Internet!

Jikexueyuan Spider

Jikexueyuan is one of my favorite websites for studying coding on my own. One of its biggest advantages is that the lessons are relatively short compared with the long and boring lessons on some websites. However, its UI does a poor job of showing learners how different classes connect. For example, I want to study HTML and CSS, but I cannot filter by lecturer or by time added, so it is extremely hard to work out which class should come next (time added is only shown on each course’s main page, not on the index page). All I can do is select a specific category such as HTML and face a number of unordered courses, which leaves me frustrated about which one to take after the current course.

So I would like to design an application that retrieves all the courses under a specific category, automatically opens the URL of each course in the background, and retrieves each course’s name and time added. Finally, I want it to sort the courses by time added and write the information to a .txt file so that I can analyze it conveniently. Here is the code:
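The last two steps, sorting by time added and writing the .txt file, can be sketched as below. The fetching and parsing steps are left out because they depend on the site's page markup, which is not described here. The function names and the 'YYYY-MM-DD' date format are assumptions for illustration only.

```python
from datetime import datetime


def sort_courses(courses):
    """Sort (name, date_added) pairs oldest first.

    Dates are assumed to be 'YYYY-MM-DD' strings; adjust the format
    string to whatever the course pages actually show.
    """
    return sorted(courses, key=lambda course: datetime.strptime(course[1], "%Y-%m-%d"))


def write_courses(courses, path):
    """Write one 'date<TAB>name' line per course, in chronological order."""
    with open(path, "w", encoding="utf-8") as handle:
        for name, date_added in sort_courses(courses):
            handle.write("{}\t{}\n".format(date_added, name))


if __name__ == "__main__":
    # Hypothetical records standing in for what the spider would collect.
    sample = [("CSS Layout", "2015-08-20"), ("HTML Basics", "2015-07-01")]
    write_courses(sample, "courses.txt")
```

Reading the resulting file from top to bottom then gives the order in which the courses were added.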

Use Python to check my email inboxes

Last month I had the idea of checking my email inboxes using Python. I did some research on the web and tried to imitate the great solutions of other coders. Because I currently use IMAP for my main email accounts, I chose the “imaplib” module for the job. Here is the code:

I needed this because it is rather difficult to check my Gmail account in China, and I wanted a simpler way to check my inboxes without opening bulkier email clients.
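A minimal sketch of an unread-mail check with `imaplib` might look like this. It is not the original script: the function names are made up for illustration, and it assumes the server accepts IMAP over SSL with a plain username and password.

```python
import imaplib


def parse_message_ids(search_data):
    """Turn the data part of a SEARCH response, e.g. [b'1 2 3'], into a list of ids."""
    return search_data[0].split() if search_data and search_data[0] else []


def count_unseen(host, user, password, mailbox="INBOX"):
    """Log in over SSL and return the number of unread messages in the mailbox."""
    connection = imaplib.IMAP4_SSL(host)
    try:
        connection.login(user, password)
        # readonly=True so that checking the inbox does not mark mail as read
        connection.select(mailbox, readonly=True)
        status, data = connection.search(None, "UNSEEN")
        if status != "OK":
            raise RuntimeError("IMAP search failed: {!r}".format(data))
        return len(parse_message_ids(data))
    finally:
        connection.logout()
```

A call such as `count_unseen("imap.gmail.com", "me@example.com", "my-password")` would print-ready return the unread count; the address and password here are placeholders.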

My First Python Application

It’s been two months since I started learning Python on my own. It all started from the bottom of my heart, from what I really want to learn. To be honest, I genuinely feel that I’ve enjoyed this journey.

After spending quite a long time on related books and videos, I decided to get some hands-on practice, which was more difficult than expected.

What drove me to write this small application is that I love watching sports and I often check a website, “”, to see which games are live today. However, I found it repetitive and a waste of time to open my browser, select the bookmark, and then scroll down to the part I want. I have tried several modules for web scraping, and this time I chose BeautifulSoup to parse the web page. The code is as follows:

It works pretty well and shows me what I need in the console. But after I came to the U.S., it started raising an error saying the variable “today” is NoneType. After debugging, I found the cause rather funny: it is the time zone difference. The variable “i” represents today’s date in the U.S., while in China it is sometimes already one day later. Evidently, the website “zhibo8” deletes all the live game information at the end of each day. So if I happen to run this application at night in Los Angeles, when it is already the next morning in China, I run into this error.

I plan to figure out a way to solve this problem later, but I am quite satisfied with my first executable Python file after two months.
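One possible fix for the date mismatch is to compute "today" in China's time zone instead of relying on the local clock. The sketch below uses only the standard library; the function name `china_date` is made up for illustration, and how the resulting date is used to look up the page section depends on the original script.

```python
from datetime import datetime, timedelta, timezone

# China Standard Time is UTC+8 all year round (no daylight saving time).
CHINA_TZ = timezone(timedelta(hours=8))


def china_date(now_utc=None):
    """Return the current calendar date in China, whatever the local time zone is."""
    if now_utc is None:
        now_utc = datetime.now(timezone.utc)
    return now_utc.astimezone(CHINA_TZ).date()


if __name__ == "__main__":
    print(china_date())
```

For example, 9 p.m. on September 14 in Los Angeles (04:00 UTC on September 15) is already noon on September 15 in China, so the lookup would use the 15th.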