A quick comparison of the Brazilian and Portuguese constitutions through Text Mining in R

Photo by Mikhail Pavstyuk on Unsplash

This is a quick, mostly visual comparison of the current Brazilian and Portuguese Constitutions. It uses some Text Mining concepts to look at a few similarities and differences between the two texts.

What you’ll see here:

Methods

For this analysis, I used Text Mining concepts and packages in R, and the steps in a nutshell are:

After that, we get a dataframe that looks like…


Photo by National Cancer Institute on Unsplash

Text Mining and Natural Language Processing are two of the most interesting fields in Data Mining today. Whether you are working with tweets from a controversial political candidate or going through books with hundreds of pages to discover some kind of pattern, there’s a lot you can do with the algorithms and packages available in R (or Python, for that matter).

For me, as a bookworm, one of the most interesting possibilities is to take one book and thoroughly analyze it, working mainly towards visualization and discovery of relevant information within the text. …


These functions will elevate your Exploratory Analysis to the next level

Photo by Carlos Muza on Unsplash

The EDA (Exploratory Data Analysis) phase of the Data Mining framework is one of the main activities when it comes to extracting information from a dataset. Whatever your ultimate goal is, be it Neural Networks, Statistical Analysis, or Machine Learning, everything should start with a good understanding and overview of the data you’re dealing with.

One of the main characteristics of an EDA is that it is a somewhat open process that depends on the toolbox and the inventiveness of the Data Scientist. …



For most of us, Google Web Search and the other main Google products are our weapons of choice whenever we need to find anything on the internet or in the real world. Whether it is to get up-to-the-minute news on the Covid pandemic, check the latest scores in your favorite sport, or find out how that tasty dish is made, Google is our #1 source of information right now.

One way to turn some of the information Google has on us to our advantage is Google Trends. It makes it easier to discover trends and analyze the behavior of our customers and users in general. Google Trends is one of the best tools for knowledge discovery and for showing, in real time (or almost), how relevant a subject is, at least in terms of web searches and public interest. …


Photo by Panos Sakalakis on Unsplash

Battery life is one of the most important factors on a portable device such as a laptop, tablet, or cellphone, especially when yours has a few years under its belt.

Focusing here on laptops running Windows, few people know that Microsoft has included a battery health report tool in Windows 7 and above. It gives you a detailed status of your device’s battery health, a comparison of design capacity versus current capacity, and other relevant pieces of information.

It’s easy, it’s quick, and you can definitely get something out of it whether you are a tech-savvy person or not. It works on Windows 7, 8, and 10 (and their sub-versions), and all you need is to run a single command in the Command Prompt of your computer. …


Photo by NordWood Themes on Unsplash

I’m an aspiring Data Scientist and I like playing with data as much as the next guy, but one thing that can be very frustrating is waiting around for a dataset to load so you can start doing your thing in R. While I realize that’s not always the case, I can confidently say most, if not all, Data Scientists have struggled with this conundrum at least a few times. That’s why I decided to do a quick comparison of 3 of the main methods to load CSV files in the R programming language.

The functions I’m gonna compare are three very well-known functions used to import CSV files into R as…
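To give an idea of what such a comparison can look like, here is a minimal sketch, not the benchmark from the article itself. It times base R's read.csv on a throwaway file and, only if they happen to be installed, readr::read_csv and data.table::fread. Those two are my assumption of common alternatives and are not necessarily the functions compared in the post. The helper time_reader, the sample data, and the file size are all hypothetical.

```r
# Hypothetical helper: run a reader several times on the same file
# and return the median elapsed time in seconds.
time_reader <- function(reader, path, reps = 3) {
  elapsed <- vapply(seq_len(reps), function(i) {
    system.time(reader(path))[["elapsed"]]
  }, numeric(1))
  median(elapsed)
}

# Build a throwaway CSV so the sketch is self-contained.
set.seed(42)
n <- 50000
sample_data <- data.frame(
  id    = seq_len(n),
  value = rnorm(n),
  group = sample(letters, n, replace = TRUE)
)
csv_path <- tempfile(fileext = ".csv")
write.csv(sample_data, csv_path, row.names = FALSE)

# Base R is always available; the other readers are assumptions
# and are only added when the package is installed.
readers <- list(read.csv = function(p) read.csv(p))
if (requireNamespace("readr", quietly = TRUE)) {
  readers$read_csv <- function(p) suppressMessages(readr::read_csv(p))
}
if (requireNamespace("data.table", quietly = TRUE)) {
  readers$fread <- function(p) data.table::fread(p)
}

# Median load time per reader, fastest first.
timings <- vapply(readers, time_reader, numeric(1), path = csv_path)
print(sort(timings))
```

Timings will vary a lot with file size, column types, and hardware, so it's worth running something like this on a file that resembles your real data before drawing conclusions.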



The growing amount of data we generate and consume today has made Data Visualization a crucial area of modern science. Through visual elements such as charts, tables, and maps, data visualization tools provide an accessible way to observe and understand trends and patterns and to identify outliers.

As a field so dependent on visual stimuli, it is important to understand how the human mind processes the information around it, in order to decide how best to direct the focus and make the most of the user's attention span, which today is shorter than ever.

Some of the most influential studies on the subject were carried out by the German School of Experimental Psychology in the early 20th century. These studies gave rise to the school of Psychology known as Gestalt Psychology, which theorizes, among other findings, that the human brain is wired to see structure, logic, and patterns even where there are none. According to these studies, we always look for meaning and causal connections, and we categorize visual stimuli as similar through various kinds of connections. …


Photo by sgcdesignco on Unsplash

A few days ago I ran into this quite underrated and little-known package called SpotifyR, which pulls data from (you guessed it) Spotify. Although R undeservedly gets a bad reputation, especially when compared to Python or other “cleaner” programming languages, I find it fun to use and fairly quick at processing data when we’re not working with huge dataframes and thousands of data points.
Let’s dive into SpotifyR and find out what exactly it can and can’t do. As described by its developer Charlie Thompson:

SpotifyR is an R wrapper for pulling track audio features and other information from Spotify’s Web API (…) it allows you to enter an artist’s name and retrieve their entire discography in seconds, along with Spotify’s audio features and track/album popularity metrics. …


Photo by Avel Chuklanov on Unsplash

Learning a language can be a tedious and time-consuming task, but with so many tools freely available online today, it doesn't have to be. Here are some of the best ones I have come across, for all tastes, which let you improve several aspects, from comprehension and writing to pronunciation. None of them alone will take anyone from zero to fluency, but it is important to use what works best for you and to practice whenever possible.

Note: I made a mini-legend with these symbols, representing each of the skills that can be trained with each of the tools. …


The Kansas City Chiefs, champions of Super Bowl 54

The Super Bowl is the popular name for the championship game of the National Football League (NFL). It’s annually the most-watched single-day sporting event in the world. Unlike other professional sports leagues that decide their champion in a series of games (usually best of 7), the NFL decides its champion with one single game, so there’s a sense of urgency, and the game is usually as thrilling as it can be. …

About

Rafael Belokurows

Aspiring Data Scientist, compulsive reader and just hoping to survive the global pandemic
