Unlike work in many other fields, data-science projects should start with focused questions. A table of data is of little use in itself unless we analyze and understand it with a specific goal in mind. Without a predefined goal we would not know where to start – what to look for in the data, how to analyze it, which extraneous elements to remove, and so on.
Category: data
Yearly medicine dosage visualization
For the past several years I’ve regularly taken Pantoprazole – a proton pump inhibitor (PPI) – to control my acid reflux. However, as is well known, long-term PPI use has been linked to various health problems. I’ll not enumerate those here, but trust me when I say that they are numerous. Last year I decided to wean myself off PPIs slowly, by limiting myself to a 20 mg dose on any single day, and alternating or skipping some days completely. I recorded my dosage information in Google Calendar on my mobile, so that I could process it later. Now that I have more than a year’s data, I thought of visualizing it to get an understanding of my dosage habits.
Visualizing missing data in databases
Missing data in databases can cause bugs in applications or incorrect calculations. Recently, while working on a RETS application, I needed to ensure that one of the MySQL tables did not contain too many missing values. Although one could easily write a SQL query to find the percentage of missing values, I often find it easier to first get a visual representation of how much data is missing in the table, and then drill down further if required. One library I found that makes it easy to visualize missing data in your database tables is missingno – a Python library.
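As a rough illustration of the "percentage of missing values" query mentioned above, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs anywhere; the `listings` table, its columns, and the `missing_percentages` helper are all hypothetical and not taken from the original application.

```python
import sqlite3


def missing_percentages(conn, table, columns):
    """Return {column: percentage of rows in which the column is NULL}."""
    total = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    result = {}
    for col in columns:
        # COUNT(col) skips NULLs, so total - COUNT(col) is the number missing.
        present = conn.execute(f"SELECT COUNT({col}) FROM {table}").fetchone()[0]
        result[col] = 100.0 * (total - present) / total if total else 0.0
    return result


# Hypothetical table with a few NULLs (assumed sample data, for illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE listings (id INTEGER, price REAL, city TEXT)")
conn.executemany(
    "INSERT INTO listings VALUES (?, ?, ?)",
    [(1, 250000.0, "Austin"), (2, None, "Dallas"), (3, 310000.0, None), (4, None, None)],
)

pct = missing_percentages(conn, "listings", ["id", "price", "city"])
print(pct)
```

For the visual route, the usual approach would be to load the table into a pandas DataFrame and pass it to `missingno.matrix`, which draws one column per field with gaps where values are missing.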
Pitfalls of assigning a wrong data type to a database column
A recent debugging session on a web application surfaced a recurrent issue in database design – that of assigning a wrong data type to a database field.
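One common form of this problem, offered here only as an assumed example and not necessarily the one from that debugging session, is storing numbers in a text column. The sketch below uses Python's built-in sqlite3 module and a hypothetical `stock` table to show how sorting then goes wrong.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: the same quantities stored once as TEXT and once as INTEGER.
conn.execute("CREATE TABLE stock (qty_text TEXT, qty_int INTEGER)")
conn.executemany(
    "INSERT INTO stock VALUES (?, ?)",
    [("9", 9), ("10", 10), ("100", 100)],
)

# Text values are compared character by character, so "10" sorts before "9".
as_text = [row[0] for row in conn.execute("SELECT qty_text FROM stock ORDER BY qty_text")]
# Integer values are compared numerically.
as_int = [row[0] for row in conn.execute("SELECT qty_int FROM stock ORDER BY qty_int")]

print(as_text)
print(as_int)
```

The text column comes back in lexicographic order, while the integer column comes back in numeric order, which is the kind of silent discrepancy that tends to surface only during debugging.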
Statistical distribution of column values in MySQL
We often need to get a statistical distribution of values in a database table. Say you have an e-commerce shoe store with a product table containing the following fields and values. As this is only an example I’ve limited the table to a few items; there will be hundreds of rows in a real-life table.
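The table itself is not shown in this excerpt, so the sketch below makes up a small `product` table (the `name`, `brand`, and `price` columns and all rows are assumptions) and runs on Python's built-in sqlite3 module instead of MySQL. It illustrates two typical distribution queries: a count per category and a simple price histogram with a bucket width of 50.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical product table; columns and rows are invented for illustration.
conn.execute("CREATE TABLE product (name TEXT, brand TEXT, price REAL)")
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?)",
    [
        ("Runner", "Acme", 49.0),
        ("Trail", "Acme", 89.0),
        ("Court", "Zenith", 120.0),
        ("Street", "Zenith", 65.0),
        ("Loafer", "Acme", 75.0),
    ],
)

# Frequency distribution: how many products each brand has.
brand_counts = conn.execute(
    "SELECT brand, COUNT(*) FROM product GROUP BY brand ORDER BY brand"
).fetchall()

# Histogram: group prices into buckets of width 50 (0-49, 50-99, 100-149, ...).
price_buckets = conn.execute(
    "SELECT CAST(price / 50 AS INTEGER) * 50 AS bucket, COUNT(*) "
    "FROM product GROUP BY bucket ORDER BY bucket"
).fetchall()

print(brand_counts)
print(price_buckets)
```

In MySQL the bucket expression would more commonly be written as `FLOOR(price / 50) * 50`; the `GROUP BY` structure stays the same.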

Using the TOML configuration format in your applications
Anyone who has programmed knows about configuration files. Configuration files are mostly plain-text files used to set the parameters and initial settings of computer programs, mainly user applications and operating systems. Below is a short list of frequently used file formats.
– XML
– Windows INI
– YAML
– JSON
– TOML