Getting PostgreSQL data from AWS

Amazon RDS is great for hosting your production (or demo) database servers because it is managed: many minor but crucial things, such as backups, are handled automatically.

However, sometimes it is necessary to work with a part of the production data on your local development machine.

If you try a simple Google search, you'll probably end up on Stack Overflow and see a suggestion that goes something like this:

$ pg_dumpall -p 5432 > /path/to/my/dump_file.sql

This, however, is not the best solution: it dumps everything, whereas we are only interested in one part of the data. Moreover, to make this approach work, we have to connect to a remote host via SSH and then transfer `dump_file.sql` back to our local machine. There must be a better way.

One such way is the `psql` client utility, which allows executing commands on remote PostgreSQL hosts.

First, we need to connect to the remote instance:

$ psql --host=proddb.our_region.rds.amazonaws.com --port=5432 --username=user_prod --password --dbname=prod_db

Once connected, we have a number of options. One option is to use the `COPY` command:

prod_db=# COPY (SELECT * FROM bt WHERE cr > '2012-05-01') TO '/tmp/file';

The problem with `COPY` is the `TO` argument. Where do you think the dump file will go? That's right: it will be stored on the remote host! We, however, need this file on our local machine, not on an RDS host. What can we do? Use the `\copy` command (note the backslash):

prod_db=# \copy (SELECT * FROM feed_xml WHERE create_date < '2018-08-30' ORDER BY create_date DESC LIMIT 10) TO '/tmp/test.csv' WITH CSV HEADER

The commands look similar, but there is an important difference. The first `COPY` is a SQL command and it is executed by the remote engine. The second `\copy` is a `psql` command and it is executed on your local machine.

The last step is to import the data into the locally running instance. First, connect to it (the `user_local` role below is a placeholder for whatever local credentials you use):
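
$ psql --host=localhost --port=5432 --username=user_local --password --dbname=local_db

Once connected, run the import: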

local_db=# \copy feed_xml FROM '/tmp/test.csv' WITH CSV HEADER

And that's it! No need to connect to a remote host using SSH or copy dump files manually.

Creating a Rancher-managed host on AWS

While Rancher 2 offers a nice wrapper around Kubernetes, Rancher 1 is still a viable alternative in case you need to get something deployed quickly and reliably.

Essentially, Rancher is just a Docker container running on an EC2 instance. It offers a fancy-looking web UI, an API for external tools like CircleCI, and an agent that orchestrates services on another host. This other host runs the business microservices, usually on top of RancherOS, and is typically also hosted on EC2.

The first step is to create an EC2 instance for the Rancher server. We can use the RancherOS image, as it comes with Docker preinstalled, and Docker is about the only thing Rancher needs. Creating a new EC2 instance should be relatively straightforward, as there are tons of tutorials on the Internet describing how it's done. The important things here are to give the instance enough disk space and to make sure it gets an external IP address (you'll need it to access the Rancher UI). It is also crucial to download the generated PEM file immediately, as AWS offers it only once.
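
For reference, creating such an instance from the command line might look something like this (a sketch using the AWS CLI; the AMI ID, key pair name, and security group below are placeholders to substitute with your own):

$ aws ec2 run-instances \
    --image-id ami-0123456789abcdef0 \
    --instance-type t2.medium \
    --key-name rancher \
    --security-group-ids sg-0123456789abcdef0 \
    --associate-public-ip-address \
    --block-device-mappings 'DeviceName=/dev/sda1,Ebs={VolumeSize=50}'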

Once the instance is up, we need to connect to it via SSH using the downloaded PEM file:

$ ssh -i rancher.pem rancher@our-server

The Rancher server keeps all its settings in its own MySQL database. It is usually a good idea to bind the MySQL data volume outside of the server container, so that we can back it up independently of the image:

$ docker run -d -v /home/rancher/mysql-data:/var/lib/mysql --restart=unless-stopped -p 80:8080 rancher/server:v1.6.21
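
Since the data now lives in /home/rancher/mysql-data on the host, a crude backup is just an archive of that directory (a sketch; for a fully consistent snapshot you would stop the container first):

$ tar czf rancher-mysql-backup.tar.gz -C /home/rancher mysql-data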

If you check the Docker logs of this container, you'll see that the Rancher server is written in Java and starts initializing immediately. Once it's initialized (it usually takes about 30 seconds), you can fire up the browser and open the UI.
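
To follow the logs, find the container ID and tail it:

$ docker ps
$ docker logs -f <container_id>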

Initially, the UI lets anyone in, so you'll probably want to restrict access to the members of your team. In order to do that, go to "Admin/Access Control", open GitHub in another tab, and register Rancher as a new OAuth app. Then go back to the Rancher UI and fill in the application's key and secret.

To test that everything works, it is recommended to log out and then log in again, authorizing yourself with GitHub.

The Rancher UI can create instances on AWS, provided it has AWS credentials with the necessary permissions. The good thing about creating instances via Rancher is that you can always download the PEM file for SSH access.
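
Alternatively, you can register an existing host manually: the UI generates a registration command for you, which looks roughly like this (the agent version, server address, and token below are placeholders that come from your own installation):

$ sudo docker run --rm --privileged \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v /var/lib/rancher:/var/lib/rancher \
    rancher/agent:v1.2.11 http://<rancher-server>:8080/v1/scripts/<registration-token>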

Handling exceptions in Scala with monad transformers

A very common characteristic of Scala code is a large number of for expressions (also known as for comprehensions), and for a good reason: for expressions are straightforward to read and relatively easy to write, as they usually boil down to only two methods, `map` and `flatMap`.

For example, here is some very typical Scala code that first grabs the ID of a user from one server and then uses this ID to query additional data from another endpoint:

for {
  user <- getCurrentUserData()
  employment <- getUserEmployment(user.id)
} yield UserEmploymentInfo(user.name, employment.employer)

This code is so simple that even a person unfamiliar with programming (let alone Scala) will immediately guess what's going on here.
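
Under the hood, the compiler simply desugars such an expression into nested `flatMap` and `map` calls, roughly equivalent to:

getCurrentUserData().flatMap { user =>
  getUserEmployment(user.id).map { employment =>
    UserEmploymentInfo(user.name, employment.employer)
  }
}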

However, there is an important assumption that the above code makes. In order for for expressions to work, both functions must return values of the same monad-like type (you can read about it in detail here), which basically means that these structures must have correctly defined `map` and `flatMap` methods. This is usually not a problem in Scala, as many useful types such as `Option`, `Try`, `Either`, and `Future` have these methods.

But what happens if the result type is a little more complicated, say, `Future[Try[_]]`? In this case, a simple for expression will not work, as it can only extract the outermost value.

It turns out that this use case is so common that generic solutions exist. Several functional Scala libraries, such as Cats and Scalaz, expose these solutions as so-called monad transformers.
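
For example, Cats can flatten such a stack with a transformer. Cats ships no transformer for `Try`, so here is a minimal sketch that uses `EitherT` over hypothetical `Future[Either[String, _]]` variants of the functions above:

import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global
import cats.data.EitherT
import cats.implicits._

case class User(id: Long, name: String)
case class Employment(employer: String)
case class UserEmploymentInfo(name: String, employer: String)

// Hypothetical variants of the earlier functions, with errors carried as Either values
def getCurrentUserData(): Future[Either[String, User]] = ???
def getUserEmployment(id: Long): Future[Either[String, Employment]] = ???

val result: EitherT[Future, String, UserEmploymentInfo] = for {
  user       <- EitherT(getCurrentUserData())
  employment <- EitherT(getUserEmployment(user.id))
} yield UserEmploymentInfo(user.name, employment.employer)

// Unwrap back to the plain nested type at the edge of the program
val unwrapped: Future[Either[String, UserEmploymentInfo]] = result.value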

However, if you're only using monadic structures from the standard library, there is an easier way, a small library called Scala Hamsters:

val result = for {
  user <- FutureTry(getCurrentUserData())
  employment <- FutureTry(getUserEmployment(user.id))
} yield UserEmploymentInfo(user.name, employment.employer)
result.wrapped

The only noticeable difference is that we now end up with an instance of `FutureTry` and, therefore, must unwrap it at the very end.

The technology balance

When it comes to choosing technology for a new project, it is often tempting to pick the latest and shiniest stack. On the other hand, many businesses tend to fall back on something proven, well-known, and often outdated. In my opinion, neither approach is ideal: as engineers, we always need to find the right balance between innovation and stability.

To illustrate the point, let's think about hardware. If you look at the price range of the latest CPUs, you'll notice that once you get to the high-end models, the price starts to increase exponentially, while the actual processing speed improves only gradually. This makes perfect sense: the process of manufacturing the newest chips is not yet established, and production lines don't work as efficiently. Inevitably, after several years, as manufacturers discover new (cheaper) ways to increase output, the price of high-end models drops dramatically.

A similar thing happens in the software industry, but with an important difference. Whereas a high-end CPU is definitely better than its cheaper counterparts, "new and shiny" technologies are routinely declared anti-patterns only months (sometimes years) after the initial craze has subsided.

This is especially true on the frontend, where new stuff gets released almost every day and is deemed outdated or "wrong" just as quickly. A good example of this phenomenon is CoffeeScript, which came out several years ago and tried to fix some of the bad parts of JavaScript while introducing some novel features along the way. In 2018, it's not even considered a viable choice anymore, which leaves certain big projects (such as Atom) that doubled down on CoffeeScript a while back in an awkward position.

In this day and age, it is surprisingly easy to end up with a legacy frontend application even before it's finished. One European bank, for example, recently finished revamping its UI with AngularJS, only to realize that Google had already deprecated it and that the new version of the framework is a complete rewrite. Ouch.

This logically brings us to the question of outsourcing. Can you outsource the creation of a website? Absolutely, and as long as you are perfectly happy with mostly static content on the home page and an overall simple design, a WordPress-based site will work perfectly well. The same reasoning can be applied to so-called "throwaway prototyping", an idea popularized by Fred Brooks in his well-known book "The Mythical Man-Month". However, there is an important caveat. As Fred Brooks himself later admitted, the idea of creating something with the intention of throwing it away is wrong, because it's overly simplistic. Indeed, in this scenario you run the risk of creating a completely dysfunctional system that doesn't offer much opportunity to learn anything from it. As a result, as Ward Cunningham suggests here, you'll have to build the prototype twice, probably with disastrous consequences for the business.

What should we do then? In my opinion, we should try to create a reasonably flexible system that works well under a reasonable load. This flexibility will enable us to adapt it to future changes while it serves the current needs of the business and its customers perfectly well. For many startups, a reasonable load could be about 10,000 concurrent connections and, quite honestly, with modern hardware, this is pretty easy to handle.

This is one of the reasons I like Scala and React. These technologies are relatively stable and give developers an enormous amount of flexibility. Scala, in particular, is a very good choice for a backend language, as it allows programmers to move very quickly while preserving type information and keeping accidental complexity close to zero. The Play framework makes a great foundation for a backend stack, as it has a ready-to-use solution for pretty much any problem out of the box. While Java is making good progress at becoming a more productive language, Scala is still miles ahead, which is well illustrated here.

Similarly, React is a very capable library and, unlike some other frontend solutions, its API has been stable almost since its inception. The ecosystem is also quite enormous, and there are usually several good choices for any task. And the final argument is React's great support for server-side rendering, which we have successfully employed in our projects (one such example can be seen here).

A Little About Us......

Two Up Labs was formed to help deliver and execute upon new business ideas. We are a high-performing, no-bullshit, rapid-delivery capability. We are not the cheapest in the market. We are not an outsourced body shop. We build businesses and business models. We do proper product management and development. We like good design, but sometimes we sacrifice "looks" for backend functionality to get to market. We are lean and we are keen. We always deliver. We are thinking about corporate innovation and incubation... we don't mind sweat equity... but only if we are in control. To be honest, we prefer to be paid, because we know there are a lot of tyre kickers out there who have grand ideas... but no idea :)

Full Stack Engineer

As a Full Stack Software Engineer, you'll be responsible for the design, build, test, and deployment of the multiple projects Two Up Labs is engaged in. You will be solely responsible for the AWS infrastructure and backend integration work utilising Scala, Docker, NodeJs, Rancher and CircleCI. You will be working with a ReactJS front-end engineer; however, you are welcome to contribute to website changes and may be required to from time to time. You'll need to have 5+ years of experience and prior exposure to full end-to-end deployment.

This role is for you if:

You are familiar with a wide variety of software development methodologies, tools, languages and approaches.

You are self-motivated and disciplined as the role allows remote work.

You are a pragmatic, fast, and practical coder for whom business deliverables and deadlines come before technical research.

You are excited by the prospect of working with a microservice architecture and a DevOps culture of collaboration, automation, and monitoring of all KPIs to ensure a stable and secure web product.

You take pride in producing clean, well-tested code.

You enjoy a collaborative and team culture, are willing to voice your opinions and share learnings with colleagues, and are happy to work together to achieve the best results.

Requirements:

You have 5+ years of work experience with Java, Scala, NodeJs, and AWS, have a strong interest in other languages, and can recommend which languages are better suited for a given job.

Familiarity with complex web-service-based architectures.

Good grounding in distributed system design, domain modelling, and several SQL datastores.

You are able to code, package and deliver the whole product end-to-end.

Excellent communicator, able to gather requirements from stakeholders, deliver product and demonstrate results in "show-and-tell" sessions.

Great to have:

Scalaz, Akka, AWS, PostgreSQL.

Continuous Delivery, ReactJS, DevOps, BDD / TDD.

Trading, financial, accounting or transactional system experience.

Lean product development and startup experience.

Be prepared to pass our Hackertest, and do it well and properly.

To apply contact: dp@twouplabs.com