From data scooping to facial recognition, Amazon’s latest additions give devs new, wide-ranging powers in the cloud
Here are 10 new services that show how Amazon is redefining what computing in the cloud can be.
Anyone who has done much data science knows it’s often more challenging to collect data than it is to perform analysis. Gathering data and putting it into a standard data format is often more than 90 percent of the job.
Glue is a new collection of Python scripts that automatically crawls your data sources to collect data, apply any necessary transforms, and stick it in Amazon’s cloud. It reaches into your data sources, snagging data using all the standard acronyms, like JSON, CSV, and JDBC. Once it grabs the data, it can analyze the schema and make suggestions.
The Python layer is interesting because you can use it without writing or understanding Python—although it certainly helps if you want to customize what’s going on. Glue will run these jobs as needed to keep all the data flowing. It won’t think for you, but it will juggle many of the details, leaving you to think about the big picture.
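To make the schema-suggestion step more concrete, here is a minimal sketch in plain Python of how a crawler might guess column types from a CSV sample. It is not Glue's code or API: the function names `infer_type` and `suggest_schema`, the type names, and the sample data are all hypothetical, chosen only for illustration.

```python
import csv
import io

def infer_type(value):
    """Guess the narrowest type that can hold a single CSV field."""
    for caster, name in ((int, "int"), (float, "double")):
        try:
            caster(value)
            return name
        except (TypeError, ValueError):
            continue
    return "string"

# Ordered from narrowest to widest; a column takes the widest type seen.
_WIDTH = {"int": 0, "double": 1, "string": 2}

def suggest_schema(csv_text):
    """Return {column name: suggested type} for CSV text with a header row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    schema = {}
    for row in reader:
        for column, value in row.items():
            guess = infer_type(value)
            current = schema.get(column, "int")
            schema[column] = max(current, guess, key=_WIDTH.__getitem__)
    return schema

# Hypothetical sample data.
sample = "id,price,name\n1,9.99,widget\n2,12,gadget\n"
print(suggest_schema(sample))
```

A real crawler would sample many files and handle dates, nulls, and nested JSON, but the widening rule (int, then double, then string) captures the basic idea of suggesting a schema from the data itself.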
Field Programmable Gate Arrays have long been a secret weapon of hardware designers. Anyone who needs a special chip can build one out of software. There’s no need to build custom masks or fret over fitting all the transistors into the smallest amount of silicon. An FPGA takes your software description of how the transistors should work and rewires itself to act like a real chip.
Amazon’s new AWS EC2 F1 brings the power of FPGAs to the cloud. If you have highly structured and repetitive computing to do, an EC2 F1 instance is for you. With EC2 F1, you can create a software description of a hypothetical chip and compile it down to a tiny number of gates that will compute the answer in the shortest amount of time. The only thing faster is etching the transistors in real silicon.
Who might need this? Bitcoin miners compute the same cryptographically secure hash function a bazillion times each day, which is why many bitcoin miners use FPGAs to speed up the search. If you have a similarly compact, repetitive algorithm that you can write into silicon, the FPGA instance lets you rent machines to do it now. The biggest winners are those who need to run calculations that don’t map easily onto standard instruction sets, for example bit-level functions and other nonstandard, nonarithmetic calculations. If you’re simply adding a column of numbers, the standard instances are better for you. But for some, EC2 with an FPGA might be a big win.
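The hash search is a good picture of the kind of compact, repetitive kernel that suits an FPGA. The sketch below shows the shape of that loop in ordinary Python using the standard `hashlib` module. It is deliberately simplified: the header bytes, the `find_nonce` name, and the "leading zero bits" difficulty are assumptions for illustration, not the real Bitcoin block format.

```python
import hashlib

def double_sha256(data):
    """Apply SHA-256 twice, the hash construction Bitcoin uses."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def find_nonce(header, zero_bits, max_tries=1_000_000):
    """Search for a nonce whose hash starts with `zero_bits` zero bits.

    Returns (nonce, digest), or None if nothing is found within max_tries.
    """
    target = 1 << (256 - zero_bits)  # hashes below this value qualify
    for nonce in range(max_tries):
        digest = double_sha256(header + nonce.to_bytes(4, "little"))
        if int.from_bytes(digest, "big") < target:
            return nonce, digest
    return None

# Hypothetical header and a low difficulty so the loop finishes quickly.
result = find_nonce(b"example header", zero_bits=12)
if result is not None:
    nonce, digest = result
    print(nonce, digest.hex())
```

On a CPU, every pass through the loop is a long sequence of general-purpose instructions. On an FPGA, the same bit-level rotations and XORs could be laid out as a fixed pipeline of gates, which is where the speedup comes from.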
As Docker eats its way into the stack, Amazon is trying to make it easier for anyone to run Docker instances anywhere, anytime. Blox is designed to juggle the clusters of instances so that the optimum number are running—no more, no less.
Blox is event driven, so it’s a bit simpler to write the logic. You don’t need to constantly poll the machines to see what they’re running. They all report back, so the right number can run. Blox is also open source, which makes it easier to reuse Blox outside of the Amazon cloud, if you should need to do so.
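The difference between polling and reacting to events can be sketched in a few lines of Python. This is not Blox's API: the `ClusterState` class, the status strings, and the task names are hypothetical, and the sketch only illustrates how a scheduler can keep the right number of tasks running by listening to the state changes that instances report.

```python
from dataclasses import dataclass, field

@dataclass
class ClusterState:
    """Tracks running tasks from the events they report; no polling loop."""
    desired: int
    running: set = field(default_factory=set)

    def handle_event(self, task_id, status):
        """Apply one state-change event and return the adjustment needed.

        A positive result means that many tasks should be started;
        a negative result means that many should be stopped.
        """
        if status == "RUNNING":
            self.running.add(task_id)
        elif status == "STOPPED":
            self.running.discard(task_id)
        return self.desired - len(self.running)

# Hypothetical event stream for a cluster that should run two tasks.
state = ClusterState(desired=2)
events = [("task-a", "RUNNING"), ("task-b", "RUNNING"),
          ("task-c", "RUNNING"), ("task-a", "STOPPED")]
for task_id, status in events:
    delta = state.handle_event(task_id, status)
    print(f"{task_id} {status}: adjust by {delta:+d}")
```

The scheduler does work only when something changes, so there is no timer loop asking every machine what it is running.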
Monitoring the efficiency and load of your instances used to be simply another job. If you wanted your cluster to work smoothly, you had to write the code to track everything. Many people brought in third parties with impressive suites of tools. Now Amazon’s X-Ray is offering to do much of the work for you. It’s competing with many third-party tools for watching your stack.
When a website gets a request for data, X-Ray traces it as it flows through your network of machines and services. Then X-Ray will aggregate the data from multiple instances, regions, and zones so that you can stop in one place to flag a recalcitrant server or a wedged database. You can watch your vast empire with only one page.
Rekognition is a new AWS tool aimed at image work. If you want your app to do more than store images, Rekognition will chew through images searching for objects and faces using some of the best-known and tested machine vision and neural-network algorithms. There’s no need to spend years learning the science; you simply point the algorithm at an image stored in Amazon’s cloud, and voilà, you get a list of objects, each with a confidence score that rates how likely the answer is to be correct. You pay per image.
The algorithms are heavily tuned for facial recognition. The algorithms will flag faces, then compare them to each other and to reference images to help you identify them. Your application can store the meta information about the faces for later processing. Once you put a name to the metadata, your app will find people wherever they appear. Identification is only the beginning. Is someone smiling? Are their eyes closed? The service will deliver the answer, so you don’t need to get your fingers dirty with pixels. If you want to use impressive machine vision, Amazon will charge you not by the click but by the glance at each image.
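Here is a rough sketch of what "pointing the algorithm at an image" might look like from Python. The `detect_labels` call and its `Image`, `MaxLabels`, and `MinConfidence` parameters follow the Rekognition client in the boto3 SDK, but the helper name `top_labels`, the bucket and key, and the `FakeRekognition` stand-in with its made-up labels are assumptions so the example can run offline. In real use you would pass `boto3.client("rekognition")` instead of the fake.

```python
def top_labels(client, bucket, key, min_confidence=80.0):
    """Return (label, confidence) pairs for an image stored in S3.

    `client` is expected to behave like boto3.client("rekognition").
    """
    response = client.detect_labels(
        Image={"S3Object": {"Bucket": bucket, "Name": key}},
        MaxLabels=10,
        MinConfidence=min_confidence,
    )
    labels = [(item["Name"], item["Confidence"]) for item in response["Labels"]]
    # Most confident answers first.
    return sorted(labels, key=lambda pair: pair[1], reverse=True)

class FakeRekognition:
    """Stand-in client with made-up data so the sketch runs offline."""
    def detect_labels(self, Image, MaxLabels, MinConfidence):
        canned = [{"Name": "Dog", "Confidence": 91.2},
                  {"Name": "Person", "Confidence": 98.7},
                  {"Name": "Bicycle", "Confidence": 55.0}]
        kept = [c for c in canned if c["Confidence"] >= MinConfidence]
        return {"Labels": kept[:MaxLabels]}

# Hypothetical bucket and object key.
print(top_labels(FakeRekognition(), "my-bucket", "photos/park.jpg"))
```

Passing the client in as a parameter keeps the helper testable and makes it clear that the only AWS-specific piece is the single `detect_labels` call.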
Working with Amazon’s S3 has always been simple. If you want a data structure, you request it and S3 looks for the part you want. Amazon’s Athena now makes it much simpler. It will run the queries on S3, so you don’t need to write the looping code yourself. Yes, we’ve become too lazy to write loops.
Athena uses SQL syntax, which should make database admins happy. Amazon will charge you for every byte that Athena churns through while looking for your answer. But don’t get too worried about the meter running out of control because the price is only $5 per terabyte. That’s about half a billionth of a cent per byte. It makes the penny candy stores look expensive.
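The per-byte figure is easy to check with a few lines of arithmetic. The sketch below assumes a decimal terabyte (10^12 bytes) and ignores any minimum charge or rounding that actual billing may apply; the 250 GB query size is a made-up example.

```python
PRICE_PER_TB_USD = 5.0   # price quoted in the text
BYTES_PER_TB = 10 ** 12  # assumption: decimal terabyte

def cost_per_byte_cents():
    """Price of scanning one byte, in cents."""
    return PRICE_PER_TB_USD * 100 / BYTES_PER_TB

def query_cost_usd(bytes_scanned):
    """Cost in dollars of a query that scans `bytes_scanned` bytes."""
    return PRICE_PER_TB_USD * bytes_scanned / BYTES_PER_TB

print(f"{cost_per_byte_cents():.1e} cents per byte")
print(f"${query_cost_usd(250 * 10 ** 9):.2f} to scan 250 GB")
```

Five dollars is 500 cents, and 500 divided by 10^12 is 5 × 10^-10 cents, or half a billionth of a cent per byte. A query that scans 250 GB works out to $1.25.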
The original idea of a content delivery network was to speed up the delivery of simple files like JPG images and CSS files by pushing out copies to a vast array of content servers parked near the edges of the Internet. Amazon is taking this a step further by letting us push Node.js code out to these edges, where it will run and respond. Your code won’t sit on one central server waiting for the requests to poke along the backbone from people around the world. It will clone itself, so it can respond in milliseconds without being impeded by all that network latency.
Amazon will bill your code only when it’s running. You won’t need to set up separate instances or rent out full machines to keep the service up. It is currently in a closed test, and you must apply to get your code in their stack.
If you want some kind of physical control of your data, the cloud isn’t for you. The power and reassurance that come from touching the hard drive, DVD-ROM, or thumb drive holding your data aren’t available to you in the cloud. Where is my data exactly? How can I get it? How can I make a backup copy? The cloud makes anyone who cares about these things break out in cold sweats.
The Snowball Edge is a box filled with data that can be delivered anywhere you want. It even has a shipping label that’s really an E-Ink display, just like the one Amazon puts on a Kindle. When you want a copy of massive amounts of data that you’ve stored in Amazon’s cloud, Amazon will copy it to the box and ship the box to wherever you are. (The documentation doesn’t say whether Prime members get free shipping.)
Snowball Edge serves a practical purpose. Many developers have collected large blocks of data through cloud applications, and downloading these blocks across the open internet is far too slow. If Amazon wants to attract large data-processing jobs, it needs to make it easier to get large volumes of data out of the system.
If you’ve accumulated an exabyte of data that you need somewhere else for processing, Amazon has a bigger version called Snowmobile that’s built into an 18-wheel truck complete with GPS tracking.
Oh, it’s worth noting that the boxes aren’t dumb storage boxes. They can run arbitrary Node.js code too, so you can search, filter, or analyze … just in case.
Once you’ve amassed a list of customers, members, or subscribers, there will be times when you want to push a message out to them. Perhaps you’ve updated your app or want to convey a special offer. You could blast an email to everyone on your list, but that’s a step above spam. A better solution is to target your message, and Amazon’s new Pinpoint tool offers the infrastructure to make that simpler.
You’ll need to integrate some code with your app. Once you’ve done that, Pinpoint helps you send out the messages when your users seem ready to receive them. Once you’re done with a so-called targeted campaign, Pinpoint will collect and report data about the level of engagement with your campaign, so you can tune your targeting efforts in the future.
Who gets the last word? Your app can, if you use Polly, the latest generation of speech synthesis. In goes text and out comes sound—sound waves that form words that our ears can hear, all the better to make audio interfaces for the internet of things.
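As a rough sketch of "in goes text and out comes sound," the helper below wraps the `synthesize_speech` call from the Polly client in the boto3 SDK, which returns the audio as a stream under the `AudioStream` key. The helper name `text_to_mp3` and the `FakePolly` stand-in are assumptions so the example runs offline, and the fake returns placeholder bytes rather than real audio. In real use you would pass `boto3.client("polly")`.

```python
import io

def text_to_mp3(client, text, voice="Joanna"):
    """Synthesize `text` and return the MP3 bytes.

    `client` is expected to behave like boto3.client("polly").
    """
    response = client.synthesize_speech(
        Text=text, OutputFormat="mp3", VoiceId=voice
    )
    return response["AudioStream"].read()

class FakePolly:
    """Stand-in client that returns placeholder bytes, not real audio."""
    def synthesize_speech(self, Text, OutputFormat, VoiceId):
        payload = b"fake-audio:" + Text.encode("utf-8")
        return {"AudioStream": io.BytesIO(payload)}

audio = text_to_mp3(FakePolly(), "The front door is open.")
print(len(audio), "bytes")
```

With a real client, the returned bytes could be written to a file or streamed straight to a speaker on a connected device.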
SETA is an official Amazon AWS Solution Provider with a team of AWS certified solution architects and Cloud engineers standing by to assist you with all of your Cloud Services needs.
To get started with these service offerings, contact SETA and our experienced team will assist you.