I am technical fellow from Objectivity. For the last nine years I have been working extensively with data warehousing and BI using numerous Microsoft technologies and tools. Currently especially interested in Big Data concepts.
Scala and Apache Spark might seem an unlikely medium for implementing an ETL process, but there are reasons for considering it as an alternative. After all, many Big Data solutions are ideally suited to the preparation of data for input into a relational database, and Scala is a well thought-out and expressive language. Krzysztof Stanaszek describes some of the advantages and disadvantages of a scala-based approach to implementing and testing an ETL solution. … Read more
It is worth getting familiar with Apache Spark because it a fast and general engine for large-scale data processing and you can use you existing SQL skills to get going with analysis of the type and volume of semi-structured data that would be awkward for a relational database. With an IDE such as Databricks you can very quickly get hands-on experience with an interesting technology.… Read more