
Blog

ConnectorDB 0.3.0?

Aug 17, 2017. | By: Daniel

Almost a year ago, the first alpha of 0.3.0 was released. This was an initial version of ConnectorDB, without any visualization or analysis capabilities. The alpha was basically a powerful REST API, without the ability to view any data.

Over the past year, an enormous amount of work went into ConnectorDB’s visualization and analysis capabilities. Now, querying a stream automatically gives you a variety of plots!

This post will list out some of the changes that have already happened, and several others that will be made in the next 2 weeks as the final 0.3.0 release is prepared.

Installing & Ease of Use

The original ConnectorDB could only be installed on a Linux server, preferably at a cloud hosting provider (such as DigitalOcean or Vultr). The underlying issue with such deployments is that they severely limit the number of people who can use ConnectorDB to those with extensive Linux and IT experience.

With this release, ConnectorDB has full Windows support. But that’s just the start: the desktop version of ConnectorDB will automatically create and manage a ConnectorDB server on your own computer. This means that all you need to do is run the installer, and you’re good to go for personal data-logging!

Even the Android app will be able to sync only when connected to your home network - no need to have ConnectorDB running on an exposed internet server! All you need is your laptop, desktop, or Raspberry Pi.

Visualization

With this version of ConnectorDB, your data can be automatically plotted.


ConnectorDB will display the relevant plots for any type of data it can read: if you have numeric data, it will display plots; if you have categorical data, it will display bar charts; and if you have GPS data, it will display a map.
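Conceptually, the dispatch is something like the following Go sketch. This is purely illustrative - the names and the exact mapping are my shorthand here, not the actual frontend code:

// Purely illustrative sketch: picking a default visualization from the
// JSON Schema type of a stream's datapoints. Not ConnectorDB's actual code.
package main

import "fmt"

func defaultVisualization(schemaType string) string {
	switch schemaType {
	case "number", "integer":
		return "line plot"
	case "string", "boolean":
		return "bar chart of categories"
	case "object": // e.g. {"latitude": ..., "longitude": ...} for GPS streams
		return "map"
	default:
		return "raw data view"
	}
}

func main() {
	for _, t := range []string{"number", "string", "object"} {
		fmt.Println(t, "->", defaultVisualization(t))
	}
}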

Analysis

For the first time in 0.3.0, the dataset API is directly accessible through the frontend - you can correlate streams with each other. Once again, the relevant plots are shown automatically. For example, correlating a numeric stream (such as productivity or mood) with GPS coordinates gives you a map showing how you felt in different locations.


Android

Finally, the Android app underwent a huge update. With this release, you will be able to input data in seconds directly from the app! The app also gathers more data than before, and will soon enable you to sync only when on your local wifi network.

What’s Next?

Work on ConnectorDB has stalled - this is due to a reason that deserves its own blog post (hint: AI is hard!). Further updates will come once I figure out the AI!

Progress Report: PipeScript

May 9, 2017. | By: Daniel

There have been no recent updates in the repository, and one might be tempted to think that there has been no progress towards the 0.3.0 release. While work on ConnectorDB has slowed considerably in the past month due to other work, it has not stopped. The main reason there were no commits is that PipeScript is getting a near-total rewrite, and moving past the technical hurdles of enabling the core new features and optimizations is very difficult. I will briefly describe the new parts of PipeScript here:

Expander Transforms

Up until now, transforms in PipeScript could only return up to 1 datapoint per input datapoint. With the new version of PipeScript, transforms labeled as “Expanders” will be able to “expand” the streams. Why is this needed? One of the core use cases of while loops in PipeScript is to aggregate by time. For example:

while(day==next:day, sum)

This transform sums over datapoints from the same day. It works well for dense streams, where there are many datapoints per day, but days with no datapoints get skipped entirely! To handle this situation, an iterate transform will expand the stream:

iterate(day,sum)

The transform returns the sum for each day, but when a day is skipped, it inserts a dummy datapoint with value 0. Expander transforms in general permit more advanced processing of streams, which is why they are so important to the next version of PipeScript.
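To make the expander idea concrete, here is a rough Go sketch of what iterate(day,sum) does conceptually - illustrative names and types only, not the actual PipeScript implementation:

// Rough sketch of the "expander" behavior: sum per day, but emit a
// zero-valued dummy datapoint for any day that has no data.
// Datapoint and sumByDay are illustrative names, not PipeScript internals.
package main

import "fmt"

type Datapoint struct {
	Day   int // days since some epoch, for simplicity
	Value float64
}

func sumByDay(input []Datapoint) []Datapoint {
	if len(input) == 0 {
		return nil
	}
	sums := map[int]float64{}
	first, last := input[0].Day, input[0].Day
	for _, dp := range input {
		sums[dp.Day] += dp.Value
		if dp.Day < first {
			first = dp.Day
		}
		if dp.Day > last {
			last = dp.Day
		}
	}
	var out []Datapoint
	for day := first; day <= last; day++ {
		// sums[day] is 0 for skipped days, which creates the dummy datapoint
		out = append(out, Datapoint{Day: day, Value: sums[day]})
	}
	return out
}

func main() {
	// Day 1 has no datapoints, yet it still shows up in the output with value 0.
	fmt.Println(sumByDay([]Datapoint{{0, 2}, {0, 3}, {2, 7}}))
	// Prints: [{0 5} {1 0} {2 7}]
}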

Filtering in SubTransforms

I personally really hated the fact that I needed to add if last to every transform to get a useful result. Currently, to get the mean of the queried datapoints, you need to write:

mean | if last

The reason for this is that PipeScript currently does not allow filters (transforms that return a single answer) inside other transforms, so if the mean transform returned just one datapoint, using it in a while would not be possible:

while(day==next:day,mean)

I am happy to say that in this version of PipeScript, this limitation will be removed, and the sum/mean transforms will only return one answer. This decision is the most technically challenging of all the changes to PipeScript, because transforms are permitted to peek forward into the stream. When using a hijacking transform (such as while, which hijacks its second argument to use as a script instead), the transform needs to be able to peek forward through future values of a transform. For example:

map(weekday,if not last)

The above transform returns the second-to-last datapoint in each weekday. The last transform only knows that a datapoint is last by looking forward in its stream and checking whether there are any datapoints there. Suppose that the last transform wants to look at the next datapoint on a Tuesday. This means that the map transform might need to compute datapoints a whole week into the future to find the next point that falls on a Tuesday. This leads to a lot of complexity, and is very difficult to get right!
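To give a feel for the machinery involved, here is a simplified Go sketch of forward peeking over a stream - illustrative types only, not the real code (which, as noted below, is still unwritten). The hard part in PipeScript is doing this through nested, hijacked transforms rather than over a plain slice:

// Simplified sketch of forward peeking: a wrapper buffers datapoints so an
// inner transform (like "last") can look ahead without consuming values that
// later stages still need. Iterator and peekIterator are illustrative names.
package main

import "fmt"

// Iterator yields one datapoint at a time; nil means the stream is finished.
type Iterator interface {
	Next() *float64
}

type sliceIterator struct {
	data []float64
	pos  int
}

func (s *sliceIterator) Next() *float64 {
	if s.pos >= len(s.data) {
		return nil
	}
	v := s.data[s.pos]
	s.pos++
	return &v
}

// peekIterator buffers values pulled from its source so they can be
// inspected ahead of time and still returned later by Next.
type peekIterator struct {
	src    Iterator
	buffer []*float64
}

// Peek returns the datapoint i positions ahead, buffering as needed.
func (p *peekIterator) Peek(i int) *float64 {
	for len(p.buffer) <= i {
		p.buffer = append(p.buffer, p.src.Next())
	}
	return p.buffer[i]
}

// Next returns the next datapoint, preferring buffered values.
func (p *peekIterator) Next() *float64 {
	if len(p.buffer) > 0 {
		v := p.buffer[0]
		p.buffer = p.buffer[1:]
		return v
	}
	return p.src.Next()
}

func main() {
	it := &peekIterator{src: &sliceIterator{data: []float64{1, 2, 3}}}
	fmt.Println(*it.Peek(0), *it.Peek(1)) // look ahead without consuming: 1 2
	fmt.Println(*it.Next(), *it.Next())   // consume normally: 1 2
}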

This was the main source of the delay - I think I figured out how to do it correctly, but the code is yet to be written, so time will tell!

Speed

PipeScript is slow. The current version can sum 1 million datapoints per second on my laptop (i5-4200U), which is pathetic compared to how fast a sum can be computed on the same processor when coded directly:

BenchmarkSum           1000000   1077 ns/op     328 B/op  10 allocs/op
BenchmarkWhile          500000   3842 ns/op    1045 B/op  35 allocs/op
BenchmarkRawSum      500000000   2.82 ns/op       0 B/op   0 allocs/op
BenchmarkRawWhile   1000000000   2.34 ns/op       0 B/op   0 allocs/op
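For those unfamiliar with the format: these lines are output from Go’s built-in benchmarking. As a rough illustration (not ConnectorDB’s actual benchmark code), the RawSum line corresponds to a micro-benchmark along these lines, run with go test -bench . -benchmem:

// Illustrative only: the kind of raw-sum micro-benchmark that produces a
// line like BenchmarkRawSum above - plain float64 additions with no
// interface conversions, so there are no allocations per operation.
package pipescript_test

import "testing"

var data = make([]float64, 1024)

func BenchmarkRawSum(b *testing.B) {
	var total float64
	for i := 0; i < b.N; i++ {
		total += data[i%len(data)]
	}
	_ = total
}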

The goal of the new PipeScript is to be able to do a sum of 10 million datapoints in a second, meaning that it must take at most 100ns per datapoint.

Much of the time in the old version was spent converting between types using the duck library. The first optimization done in preparation for the new PipeScript was a new method of type conversion, called quack, which is over 10 times faster.
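To illustrate the general idea (this is deliberately not quack’s actual API - just the kind of reflection-free fast path such a conversion library relies on), converting JSON-decoded values with a plain type switch avoids most of the overhead:

// Illustrative sketch of reflection-free type conversion via a type switch.
// toFloat is a made-up helper, not part of quack or duck.
package main

import (
	"fmt"
	"strconv"
)

func toFloat(v interface{}) (float64, bool) {
	switch x := v.(type) {
	case float64:
		return x, true
	case int:
		return float64(x), true
	case int64:
		return float64(x), true
	case bool:
		if x {
			return 1, true
		}
		return 0, true
	case string:
		f, err := strconv.ParseFloat(x, 64)
		return f, err == nil
	default:
		return 0, false
	}
}

func main() {
	for _, v := range []interface{}{3.5, 2, true, "4.2", nil} {
		f, ok := toFloat(v)
		fmt.Println(v, "->", f, ok)
	}
}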

This by itself is a good start, but won’t bring the speed up to the goal. Notice that there are 10 allocations per operation in a simple sum! My goal is to avoid allocations where possible, and currently it looks like the simple sum transform will not allocate at all during runtime. This still needs to be benchmarked, since the current code is basically a placeholder for the future functionality.

That’s it

While there are a couple more features planned, the ones above are those I think are on track to be implemented soon. With this upgrade, PipeScript will become a much more powerful analysis language. Once it is finished, only documentation and debugging will be left for the 0.3.0 release of ConnectorDB!

ConnectorDB 0.3.0 Release Delayed

Mar 19, 2017. | By: Daniel

Unfortunately, the release of ConnectorDB 0.3.0 will be delayed by a couple weeks. This delay is due to three reasons:

  1. A publication deadline in my research is approaching and requires my full attention until the end of March, so I will not be able to do the final release prep for a March 25 release.
  2. If it were just the first reason, the release would have been moved to April 9, but I now think that April 17 might be a better release date. The reason is the analysis page, which allows correlating streams of data and visualizing the resulting datasets. The analysis page has only existed for 2 weeks, and the focus so far was on adding the basic features to make it useful. Today I tried the analysis capabilities on my own data for the first time. My first thought was “this is incredibly powerful”… but my second thought was “if only I could see X”. I realized that with just a bit more work, it could be made so much better. I think it is worth waiting a few days to have these extra visualizations.
  3. I showed off the analysis features to a couple of non-technical people. The main takeaway was that they loved what could be done, but they got lost with the PipeScript necessary for advanced analysis. While PipeScript’s power is needed to enable truly deep analysis, getting people to start using it requires a very good introduction. Therefore, it will be useful to have a series of “Data Analysis Tutorials” that iteratively introduce readers to more and more advanced data manipulation in the context of a full dataset.

In order to allow time to handle points 2 and 3, ConnectorDB 0.3.0 will be released on April 17. The extra week should be just barely enough to get those two done.

EDIT: The release might be delayed further. Adding more advanced analysis capabilities has proved to be a bit more complex than expected, even with the extra delay. In order to unleash the full analytical power that ConnectorDB is capable of, PipeScript itself needs an upgrade - and I don’t want to hurry something so critical to ConnectorDB.

In the meantime, you can download the latest pre-release build through the downloads page, but you have to promise to upgrade to the final release version in April!

Remember that if you are upgrading from a version of ConnectorDB older than 0.3.0-git.1077 (such as 0.3.0a1), you will need to export your data first and reimport it into your new server. Click here for instructions. Versions newer than 0.3.0-git.1077 should be directly compatible with the final 0.3.0 release.

ConnectorDB 0.3.0 Release Date: March 25

Mar 7, 2017. | By: Daniel

Almost a year ago, the first alpha of 0.3.0 was released. This was an initial version of ConnectorDB, without any visualization or analysis capabilities. The alpha was basically a powerful REST API, without the ability to view any data.

Much has changed over the past year, and ConnectorDB has iteratively gained more and more visualization and analysis features. At this point, it is nearly ready for the full 0.3.0 release :D. Therefore, the release date was set for March 25, 2017.

At this point, some bugs are still being fixed and a couple of visualizations are being added, but the functionality available right now gives a good idea of what’s coming in 2 weeks.

Having said that, there are several things that still need to be fixed up before the final release:

  • some bugfixes to the data gathering capabilities
  • android syncing on local network
  • some fixes to datasets
  • overall bugfixing
  • lots of docs and tutorials!

If you’re feeling adventurous, you can try the pre-release development builds and the beta Android app, but be aware that they might be buggy and have other issues. You’ll have to promise to update to the full version of ConnectorDB when it comes out!

ConnectorDB 0.3.0a1 Released!

Jun 7, 2016. | By: Daniel

ConnectorDB is now open-source! The first public alpha has been released, and you can download it using the download link on this site.

To learn a bit more about the current features of ConnectorDB, you can look at this blog post, which explains where we are right now.

What is ConnectorDB?

Jun 6, 2016. | By: Daniel

There already exist many apps and fitness trackers that gather and attempt to make sense of your data. Most of these services are isolated - your phone’s fitness tracking software knows nothing about your browser’s time-tracking extension. Furthermore, each app and service has its own method for downloading data (if it offers raw data at all!), which makes an all-encompassing analysis of your life extremely tedious. ConnectorDB offers a self-hosted, open-source alternative to these services. It allows every device you have to synchronize with one central database, making it possible to build an in-depth picture of your life.

The ConnectorDB API gives you total access to your data, and is built specifically with Machine Learning in mind. Anyone with the technical knowledge can build and share programs that extract insights from any data. To make Machine Learning easier, ConnectorDB also allows a supervision signal in the form of life ratings, enabling learning algorithms to optimize for your goals directly.

In the future, having such a centralized database will enable truly smart devices. Just like there are many disparate apps for self-tracking, current IoT devices seem to be unaware of anything but their own sensors. With devices connected to a central life-server such as ConnectorDB, learning algorithms can be aware of everything. You would not need to control your lights with an app - a central AI would set them to the perfect intensity based on everything from your stress level to your schedule. ConnectorDB and its successors will enable our surroundings to automatically adjust themselves to hopefully make our lives easier.

What is the Mission behind ConnectorDB?

Jun 6, 2016. | By: Daniel

For thousands of years, human beings have improved their lives through observation, experiment, and logical thinking. From noticing which foods make us feel good to searching for the ideal sleep pattern, we are always striving to be healthier, happier, and more productive.

Sometimes we do a pretty good job at this task, but there are limits to what we can observe and analyze. Traditional experiments may take years to collect data only to run basic analyses and find the simplest of patterns.

Recent advances in data analysis and machine learning pave the way forward. Using powerful computers, these programs are sometimes able to find patterns and insights in datasets that defy traditional patterns of thinking.

The question at the heart of ConnectorDB is simple: what if we collected data about our lives and subjected it to the meticulous eye of our most powerful ML algorithms?


Contribute

ConnectorDB is a very new open-source project. If you are a designer, developer, or ML enthusiast, head on over to the ConnectorDB GitHub, where you can choose which part of ConnectorDB you want to contribute to! Pull requests and bug reports are welcome!