As part of a technical assessment I had to complete, I was asked to import data using Python, which sounds easy enough. But what about a Google Sheet that has several sheets within it? I decided to write a tutorial on how to import the data from such a Google Sheet into Google Colaboratory for data preparation and analysis.
Familiarity with the following:
NOTE: I will use “>>” as the output if there is one.
Give Access to all files in your Google Drive. You will need to click on the provided link after running the…
I wanted to write this article because I remember wanting to jump straight into building a model when I was first starting my data science course with Flatiron School. However, feeding skewed data into a model may (but not always!) affect its performance and training stability. I won’t discuss which models rely on the assumption that features are normally distributed or, in other words, have that infamous bell curve you have probably heard about. Instead, I will show you what it looks like to transform left- and right-skewed data toward a normal distribution if you need it for your model.
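To make the idea concrete, here is a minimal sketch of one common approach, a log transform, applied to synthetic data. It is not necessarily the method used in the article itself: the sample data, the `skewness` helper, and the reflect-then-log step for the left-skewed case are all assumptions chosen for illustration, and only the standard library is used.

```python
import math
import random


def skewness(values):
    """Sample skewness: third central moment divided by the cubed standard deviation."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


rng = random.Random(0)

# Synthetic right-skewed data (log-normal), an assumption for this demo.
right_skewed = [rng.lognormvariate(0.0, 1.0) for _ in range(10_000)]

# Mirror it around a constant to get a left-skewed sample.
top = max(right_skewed) + 1.0
left_skewed = [top - v for v in right_skewed]

# Right skew: a log transform compresses the long right tail.
right_fixed = [math.log(v) for v in right_skewed]

# Left skew: reflect so the tail points right, take the log,
# then flip the sign to restore the original ordering.
reflect = max(left_skewed) + 1.0
left_fixed = [-math.log(reflect - v) for v in left_skewed]

for name, data in [
    ("right-skewed", right_skewed),
    ("right, after log", right_fixed),
    ("left-skewed", left_skewed),
    ("left, after reflect + log", left_fixed),
]:
    print(f"{name:>26}: skewness = {skewness(data):.2f}")
```

A skewness near zero suggests a roughly symmetric distribution. The reflected log only pulls the left-skewed sample closer to symmetric rather than making it exactly normal, so it is worth plotting a histogram before and after to judge whether the transform is good enough for your model.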
As a new member of the data science community, I am always on the lookout for new ways to keep my technical knowledge sharp and further improve my coding skills. LeetCode was suggested to me by the staff at Flatiron School. I have since decided to stick more with HackerRank, which is similar to LeetCode, though not because one is necessarily better than the other. Let us talk about LeetCode.
If you are not familiar with LeetCode, it is a website that has coding challenges in algorithms, database, shell, concurrency, interview questions from companies such as Google, opportunities…
Last week, I received an email invitation to participate in a technical assessment. The email gave me some background information before I had to take it: I would be asked to perform a data merging task that linked different datasets together, and the assessment would take place on a website called HackerRank. I had the choice to use Python or any other language. …
NOTE: Skip to PySpark Installation Instructions in Google Colab if you are in a rush
You are working on a project in a Jupyter Notebook. You are excited to see what comes out of this new analysis. Perhaps you read about a new modeling or data cleaning approach. You are progressing through the data lifecycle, but you find yourself sitting there waiting for that little asterisk to change back to normal, which signifies that the computation is complete.
Last week, in Part 1, we learned how to use the OMDb API step-by-step and automate our requests to the API to mine for all the movie data we need. In Part 2, we will see how we can improve our defined function from Part 1 and capture more movies. Finally, we will integrate the new OMDb data with the movie budget dataframe.
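For readers who have not seen Part 1, automating requests roughly means building one request URL per movie title. The sketch below only constructs the URLs and does not send them; it assumes OMDb's `t` (title) and `apikey` query parameters, and the `build_omdb_url` helper, the example titles, and the placeholder key are hypothetical names that do not come from the original article.

```python
from urllib.parse import urlencode

OMDB_ENDPOINT = "http://www.omdbapi.com/"


def build_omdb_url(title, api_key):
    """Return the request URL for a single movie title (hypothetical helper)."""
    # urlencode escapes spaces and punctuation in the title.
    query = urlencode({"t": title, "apikey": api_key})
    return f"{OMDB_ENDPOINT}?{query}"


# Example titles and a placeholder key, both assumptions for illustration.
titles = ["Avatar", "The Dark Knight"]
urls = [build_omdb_url(title, "YOUR_API_KEY") for title in titles]

for url in urls:
    print(url)
```

Each URL could then be fetched with an HTTP client of your choice and the JSON response parsed into a dictionary.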
From Part 1, we saw the below defined function:
def moviesDict(movieDetails, movieInfo2, OrgMovieTitle):
    if len(movieDetails) > len(movieInfo2):  # This fills in fields that might not have been present in template…
If you are rushed, just scroll down to OMDb API Step-by-Step
Almost every weekend, my girlfriend and I head over to my parents’ house to have dinner, visit, and watch a movie together. This weekend, however, we had great difficulty finding a movie we all considered good, so we sifted through different recommendation lists we found online. We looked at three main features:
While going through my GitHub profile, I ran across my first ever project from Flatiron School’s data science program. It was a rudimentary analysis of the tragedy Macbeth, but I remember the feeling of accomplishment when I completed it and how exciting it was to see data science applied to something unconventional. Below, I would like to revisit the project and take you on the journey that I embarked on more than a year ago.
NOTE: I will be using lists, dictionaries, conditionals, and matplotlib to visualize the data from the play, so be excited and prepared to see all…
When you hear MLA, you may remember using the MLA format for essays or literary work you did in your English classes from high school or college. “So, what does this have to do with data science?” Well, let us look at the reason why MLA was founded:
“Founded in 1883, the Modern Language Association of America provides opportunities for its members to share their scholarly findings and teaching experiences with colleagues and to discuss trends in the academy.”
Interesting, so now let us look at how IBM defines data science:
“Data science is a multidisciplinary approach to extracting actionable…
An intraoperative neuromonitor who tinkers with data to see what interesting nuggets he can find.