Our big data and analytics practice is growing rapidly. Jeremy Bendat and Laith Al-Saadoon provide a behind-the-scenes look at some of the cutting-edge projects we are completing on the big data front and at the future of analytics.
Laith joined CorpInfo a little over a year ago, and since then he has gone from working on Office 365 migrations to projects that involve Big Data clusters, Redshift, and terabytes and petabytes of data, all on AWS. Laith’s nickname on the team is the “surgeon” because he earned his bachelor’s degree in biomedical sciences and originally planned to go to medical school. Laith always had a passion for technology and was always that “computer guy” among his friends, so he decided to take his career in a different direction and got into technology with CorpInfo. We like to think Laith gives our clients surgeon-level precision on their projects.
Tell me about some of the cool big data projects you are working on.
Laith has had many opportunities at CorpInfo to work with cool, cutting-edge technology. He started his career ramping up on the SMB side. These clients typically let us take the reins and wrangle their data and services into the cloud, putting all of their trust in us, which is a valuable and rewarding experience. Some of the larger big data projects Laith has worked on are for accounts with annual revenue over $200,000,000 and large teams in place. These projects are also fun because you are learning from the people around you and bouncing ideas off each other in a consulting capacity. It’s great to be able to get into the weeds with these large customers and strategize on cutting-edge technology.
What is sharding data?
A lot of customers are now asking us about sharding databases and unifying sharded data sources into Amazon Redshift, Amazon’s fully managed, petabyte-scale data warehouse. Sharding a database means splitting its data into horizontal partitions based on a defined sharding key and distributing those partitions across servers. With a range-based key, for example, the first 100,000,000 rows of data go to one server and the next 100,000,000 rows go to another. A classic example is sharding by customer geographical location keys, such as “US” and “EU”. Each individual partition is referred to as a database shard and is held on a separate database server instance to spread the load. In the legacy space, the only option was to scale up a single server by adding more storage and so forth, which became difficult to sustain. Sharding is a great way to distribute your data across multiple servers and balance the load.
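To make the idea concrete, here is a minimal Python sketch of the two routing schemes described above: range sharding by row number and key sharding by customer region. The server names, the zero-based row numbering, and the function names are hypothetical and chosen only for illustration; a real deployment would also need to handle resharding, replication, and failover.

```python
# Hypothetical layout: each shard owns a block of 100,000,000 consecutive rows.
ROWS_PER_SHARD = 100_000_000
RANGE_SHARDS = ["db-server-0", "db-server-1", "db-server-2"]

# Hypothetical mapping for sharding by customer geographical location.
REGION_SHARDS = {"US": "db-us", "EU": "db-eu"}


def shard_for_row(row_id: int) -> str:
    """Range sharding: rows 0..99,999,999 map to the first server, and so on."""
    if row_id < 0:
        raise ValueError("row_id must be non-negative")
    index = row_id // ROWS_PER_SHARD
    if index >= len(RANGE_SHARDS):
        raise ValueError(f"no shard configured for row {row_id}")
    return RANGE_SHARDS[index]


def shard_for_region(region: str) -> str:
    """Key sharding: each region key maps to its own database server."""
    if region not in REGION_SHARDS:
        raise ValueError(f"no shard configured for region {region!r}")
    return REGION_SHARDS[region]


if __name__ == "__main__":
    print(shard_for_row(42))            # lands in the first block of rows
    print(shard_for_row(100_000_000))   # first row of the second block
    print(shard_for_region("EU"))
```

The application (or a thin routing layer in front of the databases) calls one of these functions with the sharding key to decide which server should receive a read or a write.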
Tell us about how DynamoDB is changing the data landscape.
Laith has also worked on some big data pipeline projects and with new types of databases like DynamoDB. DynamoDB is a fully managed Amazon NoSQL database service known for low latency and high scalability. In one instance he completed a migration from HBase to DynamoDB. Embracing any technology goes back to the application and to working with developers to fully understand what they are trying to accomplish with their new technology solutions. If the new database lets them do everything they used to do more efficiently, more cheaply, and faster, then it’s a good idea to migrate. We’re now seeing customers who maintained huge HBase clusters (terabytes of data, NoSQL rows, etc.) being able to drop their data into a solution like DynamoDB, and there’s nothing to manage. It’s pretty amazing!
What is the future of Big Data and Analytics?
The potential for analytics is still being developed, and Amazon is leading the way. As Laith continues to shape the future of CorpInfo’s big data practice, there are three big things he is excited about. First is big data in the cloud, the overarching practice of analyzing terabytes and petabytes of data, including complex data that you can’t just put into SQL Server or MySQL. Second is the Internet of Things, or IoT, which is going to slam us with data if we aren’t prepared for it. Finally, there is the idea of serverless computing and real-time analytics.