Databases let us store information in an organized structure, typically in the form of tables with rows and columns. Below is an example table with 3 rows and 5 columns that contains user information.

+---------+------------+-------------+-----------+---------------+
| User Id | First Name | Middle Name | Last Name | Date of Birth |
+---------+------------+-------------+-----------+---------------+
| 1       | Carice     | van         | Houten    | 05-09-1938    |
| 2       | Gulfaraz   |             | Rahman    | 23-03-1990    |
| 3       | Elon       | Reeve       | Musk      | 12-10-1979    |
+---------+------------+-------------+-----------+---------------+
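
As a rough illustration, the table above could be stored and queried with Python's built-in sqlite3 module. This is only a sketch: the table name `users`, the column names, the helper `find_last_name`, and the choice to store the missing middle name as NULL are assumptions made for the example.

```python
import sqlite3

# In-memory database so the example leaves no file behind.
conn = sqlite3.connect(":memory:")

# Table and column names are illustrative assumptions.
conn.execute(
    """
    CREATE TABLE users (
        user_id INTEGER PRIMARY KEY,
        first_name TEXT NOT NULL,
        middle_name TEXT,
        last_name TEXT NOT NULL,
        date_of_birth TEXT NOT NULL
    )
    """
)

# The three rows from the table; the empty middle name is stored as NULL (None).
rows = [
    (1, "Carice", "van", "Houten", "05-09-1938"),
    (2, "Gulfaraz", None, "Rahman", "23-03-1990"),
    (3, "Elon", "Reeve", "Musk", "12-10-1979"),
]
conn.executemany("INSERT INTO users VALUES (?, ?, ?, ?, ?)", rows)
conn.commit()


def find_last_name(connection, user_id):
    """Return the last name stored for user_id, or None if there is no such row."""
    result = connection.execute(
        "SELECT last_name FROM users WHERE user_id = ?", (user_id,)
    ).fetchone()
    return result[0] if result else None


row_count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
```

Each tuple in `rows` corresponds to one row of the table, and each position in the tuple to one column.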

Structure is achieved in the form of a table with rows and columns as…


Last week, I received a request to transcribe 21,000 passports and national identity documents. I have neither the patience nor the passion to read identity cards for hours on end, so I wrote a script to solve this tedious task.

The fun began as soon as I browsed the dataset:

  1. The documents are from multiple countries and so contain text from different languages.
  2. The dataset is made of photographs of the documents and not scans with consistent dimensions.

The Dataset

A photograph of a hand holding a Dutch driving license.
An example input image from which to extract text. Source

The photograph of a driving license from the Netherlands is a good sample from the dataset. It is a good sample because…


What is a ‘Snapshot’?

According to Urban Dictionary, a snapshot is what people that aren’t up to date call screenshots.

Duh… totally clear. To a Django developer, a snapshot looks like,

snapshots['SnapshotTests::test_choices_list_zero 1'] = { 'count': 0, 'next': None, 'previous': None, 'results': [] }

Let’s read the above Python code: snapshots is a dict with the key ‘SnapshotTests::test_choices_list_zero 1’. This key is assigned a dict value {…} that mirrors a JSON object. This JSON object is the response of the API endpoint GET /api/choices/, so the stored value is a snapshot of the API response.

If we assume this JSON object is the expected…
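
The idea of comparing a response against a stored snapshot can be sketched in plain Python. This is a minimal, hand-rolled illustration and not the API of any snapshot-testing library: the helper `check_snapshot`, the JSON file layout, and the hard-coded `response` dict standing in for a real call to GET /api/choices/ are all assumptions.

```python
import json
import tempfile
from pathlib import Path


def check_snapshot(name, actual, snapshot_dir):
    """Compare `actual` with the stored snapshot, storing it on the first run.

    Returns True if the snapshot was just created or matches, otherwise False.
    """
    path = Path(snapshot_dir) / f"{name}.json"
    if not path.exists():
        # First run: record the current value as the expected one.
        path.write_text(json.dumps(actual, sort_keys=True, indent=2))
        return True
    expected = json.loads(path.read_text())
    return actual == expected


# Stand-in for the response of GET /api/choices/ when no choices exist.
response = {"count": 0, "next": None, "previous": None, "results": []}

snapshot_dir = tempfile.mkdtemp()
first_run = check_snapshot("test_choices_list_zero", response, snapshot_dir)
second_run = check_snapshot("test_choices_list_zero", response, snapshot_dir)

# A changed response no longer matches the stored snapshot.
changed = dict(response, count=1)
third_run = check_snapshot("test_choices_list_zero", changed, snapshot_dir)
```

The first call records the snapshot, the second confirms that an unchanged response still matches, and the third reports a mismatch once the response differs.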


Any application or script needs logging to share information with the user. The simplest logging method is the print function to write temporary logs. A permanent log file is a more general and valuable solution.

Source: xkcd

A log file is a text document.
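
A minimal sketch of writing a permanent log file with Python's standard logging module follows. The logger name `example_app`, the temporary log path, and the messages are assumptions chosen for the example; a real application would pick a stable location for the file.

```python
import logging
import os
import tempfile

# Hypothetical log location for the example.
log_path = os.path.join(tempfile.mkdtemp(), "app.log")

logger = logging.getLogger("example_app")
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the example's output confined to its own file

handler = logging.FileHandler(log_path, encoding="utf-8")
handler.setFormatter(
    logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
)
logger.addHandler(handler)

logger.info("application started")
logger.debug("below the INFO threshold, so this is not written")
logger.warning("disk space is low")

# The log file is ordinary text and can be read back like any other document.
handler.flush()
with open(log_path, encoding="utf-8") as log_file:
    log_lines = log_file.read().splitlines()
```

Unlike print, the level threshold lets the same code stay quiet or verbose without edits, and the file survives after the program exits.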

An application can be seen as a room filled with people, where each person represents a module in the application. Without any logging, the room is in absolute silence which, more often than not, can drive a person mad.

Code generates logs to communicate with people.

All applications benefit from good logging. In a bad logging scheme, the…


Deep learning is used to solve a wide range of problems by providing a neural network with samples of input data along with the expected output. The learning is driven by concrete mathematical rules that systematically adjust the weights of the network; think of these as a collection of numerical knobs adjusted to produce the desired output. What the network actually learns are these numbers, which transform an input question into an output answer. What do these numbers signify? Do they represent knowledge which can be interpreted and understood by humans?
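
To make the "numerical knobs" picture concrete, here is a toy sketch of gradient descent on a model with just two knobs. It is a single linear unit, not a deep network, and the samples, learning rate, and number of steps are assumptions picked for illustration.

```python
# Training samples: inputs paired with expected outputs (generated from y = 2x + 1).
samples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]

weight, bias = 0.0, 0.0  # the two "knobs", starting from an arbitrary setting
learning_rate = 0.05


def mean_squared_error(w, b):
    """Average squared gap between the model's outputs and the expected outputs."""
    return sum((w * x + b - y) ** 2 for x, y in samples) / len(samples)


initial_loss = mean_squared_error(weight, bias)

for _ in range(2000):
    # Gradients of the mean squared error with respect to each knob.
    grad_w = sum(2 * (weight * x + bias - y) * x for x, y in samples) / len(samples)
    grad_b = sum(2 * (weight * x + bias - y) for x, y in samples) / len(samples)
    # Turn each knob a small step in the direction that reduces the error.
    weight -= learning_rate * grad_w
    bias -= learning_rate * grad_b

final_loss = mean_squared_error(weight, bias)
```

After training, `weight` and `bias` settle close to 2 and 1. Here the learned numbers are easy to interpret because the model is tiny; the question raised above is whether anything similar holds for networks with millions of such numbers.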

Bombe — Wikimedia.

Let us begin at the two of the…


Plot Detection is the problem of identifying the boundaries of farms in images. This is a complex problem which needs to address issues ranging from geometrical restrictions to regional contrast and illumination variance.

Problem scope in the high-level overview.

We (very naively) applied the Nimbus pipeline to detect the boundaries of farms in satellite images. This article describes only the changes made to the solution for the noise removal process. Read the following article for a detailed report,

The Dataset

Only one training datapoint of size 12566 x 10120 (also used for hyper-parameter tuning), one test datapoint of size 8921 x 7185 and one image without marking for visual inspection…


Nimbus is built to remove clouds and the shadows cast by these clouds in Sentinel-2 satellite images gathered from the Copernicus portal. We consider clouds as noise that needs to be removed in order to monitor agricultural land.

Problem scope in the high-level overview.

The Problem

Find and remove clouds and their shadows on satellite images. We formulate this as a pattern recognition problem and use deep learning on a small annotated set of satellite images to train a model to remove clouds and accompanying shadows.

Follow the link below to understand why we want to solve this problem,

The Dataset

The training data consists of 21 manually labelled pairs of images…


4,857 satellites orbit the Earth, 1,980 of which are active, serving communications, observation, navigation, and science. Together, these satellites provide access to the Earth's surface like never before, and to data which can reshape industries at equivalent scales. Combining this seemingly infinite stream of raw coverage with the pattern recognition abilities of computing networks, we can extract information from remote locations which would be expensive to access by traditional means.

Image courtesy of the European Space Agency.

Financing in the agricultural sector is challenging, especially with limited information. Lack of information to confidently finance remote locations deters growth in those areas…

Gulfaraz Rahman

I write about problems and technology.
