Author Archives: David Scoville

Image of Python group in Excel Ribbon

Coming to an Excel Near You…

Python. Yes, the ever-popular Python programming language is coming to Excel in the near future. Microsoft is partnering with Anaconda, a leading enterprise-grade Python repository. Python in Excel uses the Anaconda distribution for Python running in Azure, which includes popular Python libraries such as pandas for data manipulation, statsmodels for advanced statistical modeling, and Matplotlib and seaborn for data visualization. For now, Python in Excel is available only in public preview for those in the Microsoft 365 Insiders program Beta Channel. The feature will roll out to Excel for Windows first, starting with build 16818, and then to the other platforms at a later date. Soon you will be able to do advanced data analysis in the familiar Excel environment by accessing Python directly from the Excel ribbon, with no setup or installation required. Read more about Python in Excel.

Can ChatGPT help us in Data Science?

By now you have probably heard about or tried ChatGPT. ChatGPT is a natural language processing “chatbot,” developed by OpenAI and released to the public in November of 2022 for testing. ChatGPT is currently free to use on the web for anyone with an OpenAI account, which is also currently free.

In a nutshell, you ask ChatGPT a question in a “conversational way” and it draws on text from “the whole Internet” to provide a response, although its knowledge cutoff is currently 2021 (ask it about an event that occurred or a celebrity who passed away in the last year and you’ll see what I mean). The release of NLP chatbots like this is huge in a lot of ways, and scary in a lot of ways. Like… if AI like this can do your job in less time for less cost, is your job in danger? I’ve been hearing that one for years, and have somehow been able to stay under the radar. 😉

Anyway, how can ChatGPT be of use in analytics and data science? Well, let’s ask!

Q. How can ChatGPT help a data analyst?

A.  “ChatGPT can assist a data analyst in several ways, such as:

  1. Natural Language Processing (NLP) tasks, such as text summarization and sentiment analysis.
  2. Generating human-like text based on the data provided, such as automatically generating reports or descriptions of the data.
  3. Helping to clean and prepare data for analysis by identifying and extracting relevant information from unstructured text data.
  4. Automating repetitive tasks such as data entry or data cleaning.
  5. Answering questions about the data, by training on specific data sets, allowing the data analyst to query the model and receive insights on the data.

Overall, ChatGPT can save time and effort for a data analyst by automating certain tasks and providing insights and information quickly.”

I input the question above into ChatGPT and the answer you see was provided in a matter of seconds. I didn’t write that answer. Interestingly enough, ChatGPT typically will not give you a word-for-word duplicate answer if you input the same question twice. In fact, there is a “Regenerate Response” button at the bottom of the page which provides a new response in case you didn’t like the first one. Let’s try that…

A2.  “ChatGPT can assist a data analyst in a number of ways, including:

  1. Data cleaning and preprocessing: ChatGPT can be used to automate tasks such as removing outliers, filling in missing values, and transforming data into a format that is ready for analysis.
  2. Text summarization: ChatGPT can be used to summarize large amounts of text data, such as customer feedback or social media posts, making it easier for analysts to identify key trends and patterns.
  3. Natural language processing: ChatGPT can be used to extract insights from unstructured text data by identifying entities, sentiment, and topics.
  4. Generating reports and visualizations: ChatGPT can be used to generate automated reports and visualizations based on the data analyzed, making it easier for analysts to communicate their findings to stakeholders.
  5. Automating repetitive tasks: ChatGPT can automate repetitive tasks such as data entry and data validation, freeing up data analysts to focus on more complex and important tasks.”

And you can ask ChatGPT to elaborate or give a more specific example of any part of the response you get. You can keep tweaking or rephrasing your question until you get the response that is most helpful or relevant. 
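
To make the “data cleaning and preprocessing” item a bit more concrete, here is a minimal Python sketch of the kind of code you might ask for. To be clear, this is a hand-written illustration and not actual ChatGPT output. The function names (`fill_missing`, `remove_outliers`), the sample data, and the choice of the median and the 1.5 × IQR rule are all assumptions made for the example; a real request could reasonably land on different methods.

```python
import statistics


def fill_missing(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]


def remove_outliers(values, k=1.5):
    """Drop points outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


# Hypothetical sample data: two missing readings and one obvious outlier.
raw = [12, 15, None, 14, 13, 250, 16, None, 15]

filled = fill_missing(raw)         # missing values become the median
cleaned = remove_outliers(filled)  # the extreme value is dropped

print(filled)
print(cleaned)
```

Whatever code you get back, it is worth running it on a small sample like this first and checking that the result matches what you expected before pointing it at real data.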

ChatGPT has a number of applications in writing code as well. You can ask ChatGPT to write the code needed to accomplish a task in virtually any programming language or even in common applications like Excel. For example, if you’re not familiar with all of the functions in Excel (and who is?), you can ask ChatGPT to write the formula or function you need. You may need to input the coordinates (rows and columns) the formula or function should act upon, and then you can likely just copy & paste the result into the formula bar or cells in Excel. Or if you are trying to automate a task, you can ask ChatGPT to write an Excel macro, then copy/paste the Visual Basic code into the Excel editor ([Alt+F11] on Windows or [Opt+F11] on Mac).

There are also applications for learning to code. Let’s say you are reasonably fluent in R but are trying to learn Python. You can input some code in R and ask ChatGPT to give you the equivalent code in Python, or vice-versa. ChatGPT may also ask clarifying questions to help debug the code. And this is just the tip o’ the iceberg, as they say. There are limitations and things to watch out for. You can find these and more information at the OpenAI > ChatGPT website. Very cool, try it while it’s still free!
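
As a rough picture of the R-to-Python translation mentioned above, here is a small hand-written example (again, not ChatGPT output). The `scores` data and the variable names are made up for illustration; each R line appears as a comment above a Python equivalent that uses only the standard library.

```python
import statistics

# R: scores <- c(88, 92, 79, 95)
scores = [88, 92, 79, 95]

# R: mean(scores)
average = statistics.mean(scores)

# R: sd(scores)   (sample standard deviation, n - 1 denominator)
spread = statistics.stdev(scores)

# R: scores[scores > 90]
high = [s for s in scores if s > 90]

print(average, round(spread, 2), high)
```

Seeing the two languages side by side like this is where the learning value comes from. It also makes subtle differences easier to spot, such as which standard deviation formula a given function uses.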