Translate your SAS code into Hive and create business impact

Change is never easy, particularly when the change is significant.

In one of our previous blogs, we identified the top 5 trends in the data science field. One of them: free, open-source tools are winning the minds of analytics pros. We expect this trend to continue and even accelerate over the next few years.

 

The industry shift to open-source

For many years, companies have relied on SAS for data processing, advanced analytics, reporting, and even campaign execution. Many SAS programs were developed and scheduled to run regularly in support of business decisions. With the adoption of new Big Data infrastructure, many companies want to take full advantage of free open-source data science tools.

One of the big challenges for companies starting on this journey is figuring out how to convert existing SAS code to Hive/SQL, Python, or R.

The conversion also needs to go smoothly so that it does not disrupt business operations. Converting SAS code to Hive/SQL or another data science language is not rocket science, but it does require coding expertise in those languages.

 

Client success: A healthcare leader moves to cloud-based analytics

Here is how we helped one of our clients in the healthcare industry translate their risk models from SAS to Hive:

Challenge: Move risk models and eliminate manually intensive SAS processes

The client wanted to move its risk model scoring process from SAS to a cloud-based analytics solution and automate it. The original SAS code base included data preparation programs, SAS format files, SAS macros, and 48 model scoring programs across 3 population segments. The client did not want to keep maintaining and running SAS code while migrating to cloud-based platforms, because doing so was complex, manual, prone to human error, and costly.

C2G solution: Proven process to convert code from SAS to Hive

After reviewing the existing SAS code, the available model scoring documents, and the client’s cloud-based platform, we recommended converting the SAS code into Hive. Once we had aligned with the client on the business requirements and approach, we carried out the conversion. Two steps were notable: 1) we converted SAS format files to lookup data tables, and 2) we converted SAS macros to Hive functions to streamline the process.
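
To illustrate the first of those two steps, here is a minimal sketch of how a SAS format can become a lookup table that is joined in SQL. It is not the client's code: the format, the table and column names, and the segment codes are all hypothetical, and Python's built-in sqlite3 module stands in for Hive so the example runs anywhere. The LEFT JOIN and COALESCE used here are also available in HiveQL.

```python
import sqlite3

# Hypothetical SAS format being replaced (illustrative only):
#   proc format;
#     value $segfmt 'C' = 'Commercial' 'M' = 'Medicare' other = 'UNKNOWN';
#   run;

# The same mapping stored as a lookup table, plus a tiny sample input table.
# All names and values below are assumptions made up for this sketch.
SETUP_SQL = """
CREATE TABLE segment_lookup (segment_code TEXT, segment_label TEXT);
INSERT INTO segment_lookup VALUES ('C', 'Commercial'), ('M', 'Medicare');
CREATE TABLE members (member_id INTEGER, segment_code TEXT);
INSERT INTO members VALUES (1, 'C'), (2, 'M'), (3, 'X');
"""

# Equi-join against the lookup table; COALESCE plays the role of `other=`.
LOOKUP_JOIN_SQL = """
SELECT m.member_id,
       COALESCE(l.segment_label, 'UNKNOWN') AS segment_label
FROM members m
LEFT JOIN segment_lookup l
  ON m.segment_code = l.segment_code
ORDER BY m.member_id
"""


def label_members():
    """Apply the lookup-table 'format' to every member row."""
    conn = sqlite3.connect(":memory:")  # in-memory stand-in for Hive
    try:
        conn.executescript(SETUP_SQL)
        return conn.execute(LOOKUP_JOIN_SQL).fetchall()
    finally:
        conn.close()
```

Keeping the mapping in a table rather than in code means a changed label becomes a data update instead of a code change.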

At each step, we ran the Hive code against the same input files used for the SAS run and compared the two sets of output. We moved to the next step only when the outputs matched 100%. When they did not, we debugged the Hive code and retested until they did.
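
A check of this kind can be sketched in a few lines of Python. This is an assumption-laden illustration rather than the tooling used on the project: it assumes both outputs are exported as CSV text with a shared key column (the name `member_id` is hypothetical), and it compares numeric fields by value so that `0.25` and `0.250` count as equal. The `tol` parameter is our addition; the default of 0.0 requires an exact match.

```python
import csv
import io


def load_rows(csv_text, key):
    """Index CSV rows by their key column."""
    return {row[key]: row for row in csv.DictReader(io.StringIO(csv_text))}


def values_match(a, b, tol):
    """Compare numerically when both values parse as floats, else as strings."""
    try:
        return abs(float(a) - float(b)) <= tol
    except (TypeError, ValueError):
        return a == b


def compare_outputs(sas_csv, hive_csv, key="member_id", tol=0.0):
    """Return a list of (key, column, sas_value, hive_value) mismatches."""
    sas, hive = load_rows(sas_csv, key), load_rows(hive_csv, key)
    mismatches = []
    for k in sorted(set(sas) | set(hive)):
        if k not in sas or k not in hive:
            # Row exists on one side only; record which side has it.
            mismatches.append((k, "<row>", k in sas, k in hive))
            continue
        for col in sorted(set(sas[k]) | set(hive[k])):
            a, b = sas[k].get(col), hive[k].get(col)
            if not values_match(a, b, tol):
                mismatches.append((k, col, a, b))
    return mismatches
```

An empty list means the two outputs agree; anything else points to the rows and columns to debug first.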

Results: Faster processing and cost savings

We then worked with the client to integrate the Hive code into their system so that model scoring runs automatically once all input files are ready. The new process reduced total processing time from 2 days to less than 5 minutes per run and saves thousands of dollars in software license costs each year.

 

C2G can drive similar success for your business

This level of success, in improved efficiency, productivity, and cost reduction, is typical when clients convert from SAS to Hive and take advantage of free open-source data science tools. Clients appreciate the real-time capability that lets them make decisions more rapidly with lower overhead cost. C2G Partners can do the same for your business.