Can Gemma 4 replace ChatGPT for analyzing data from spreadsheets?
We tested Gemma 4 and ChatGPT on a range of real-world spreadsheet problems. These were not synthetic benchmarks: they involved messy data, flawed formulas, and realistic VBA requirements taken from actual projects.
- Setup
- Test 1: Data Cleaning — Messy CSV with Inconsistent Dates and Mixed Formats
- Test 2: Pivot Table Logic - Designing and Building Pivot Table Formulas
- Test 3: Creating a VBA Macro - Merging Multiple Worksheets
- Test 4: Suggestions for charts and visualizations
- Test 5: Debugging Formulas - Diagnosing Formula Errors
- Practical advice
Every time Google releases a new open-weight model, people ask the same question: "Can I use this instead of paying for ChatGPT?" With Gemma 4, that question deserves a thorough answer, because for the first time a free, locally run model looks good enough to be a sensible choice for spreadsheet tasks.
Setup
For consistency, we ran Gemma 4 (the 27-billion-parameter version) locally via Ollama and tested ChatGPT using GPT-4o through the web interface. Both models received identical prompts, word for word.
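For readers who want to send the same prompt to a local model programmatically, the following is a minimal sketch that talks to Ollama's local HTTP endpoint using only the Python standard library. The model tag `gemma4:27b` is a placeholder assumption, not a confirmed name; use whatever tag `ollama list` reports on your machine. The function names are ours, chosen for illustration.

```python
import json
import urllib.request

# Ollama listens on this address by default when running locally.
OLLAMA_URL = "http://localhost:11434/api/generate"
# Hypothetical tag for illustration; replace with the tag shown by `ollama list`.
MODEL_TAG = "gemma4:27b"


def build_request(prompt, model=MODEL_TAG):
    """Build (but do not send) a POST request for a single, non-streamed reply."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )


def ask_local_model(prompt, model=MODEL_TAG):
    """Send the prompt to the local Ollama server and return the reply text."""
    with urllib.request.urlopen(build_request(prompt, model)) as resp:
        return json.loads(resp.read())["response"]
```

Calling `ask_local_model("...")` requires a running Ollama server with the model already pulled; nothing leaves the machine, which matters for the sensitive-data point later in the article.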
Each test was scored based on three criteria: accuracy of the output (does the formula or code actually work?), quality of the explanation (does it help the user understand what is happening?), and exception handling (does it take into account blank cells, errors, or unusual data?).
Test 1: Data Cleaning — Messy CSV with Inconsistent Dates and Mixed Formats
Scenario: A 2,000-row CSV file exported from an old CRM system. Dates appear in at least four formats: dd/mm/yyyy, mm-dd-yyyy, yyyy.mm.dd, and plain text such as "15 March 2025". Phone numbers mix country codes with local formatting. Product names have random capitalization and trailing spaces.
The same prompt was given to both models:
I have a CSV with a Date column that mixes dd/mm/yyyy, mm-dd-yyyy, yyyy.mm.dd, and written dates like '15 March 2025'. I need a single formula approach to normalise everything to dd/mm/yyyy in Excel. Also suggest a strategy for cleaning phone numbers that mix +91-XXXXXXXXXX with 0XX-XXXXXXXX formats and product names with inconsistent capitalisation.
Gemma 4's answer is excellent. It suggests a nested approach using DATEVALUE in conjunction with TEXT and SUBSTITUTE, and correctly identifies that written-out dates like "15 March 2025" need separate handling. It suggests a helper-column strategy: parse each format in its own column, wrap each attempt in IFERROR, and combine the results. For phone numbers, it suggests chaining SUBSTITUTE calls to remove hyphens and spaces, then using RIGHT to extract the last 10 digits. For product names, it correctly suggests PROPER(TRIM()).
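The "try each format, fall back on failure" idea can be checked outside Excel. The sketch below is our own Python illustration of that logic, not the formula either model produced. It assumes the separator reliably identifies the format (slashes mean day-first, hyphens mean month-first), that month names are in English, and that the last 10 digits of a phone number are the subscriber number. `str.title()` only approximates Excel's PROPER.

```python
from datetime import datetime

# Formats are tried in order, mirroring a chain of IFERROR fallbacks.
# Assumption: the separator tells day-first (/) apart from month-first (-).
DATE_FORMATS = ["%d/%m/%Y", "%m-%d-%Y", "%Y.%m.%d", "%d %B %Y"]


def normalise_date(raw):
    """Return the date as dd/mm/yyyy, or None if no known format matches."""
    text = raw.strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).strftime("%d/%m/%Y")
        except ValueError:
            continue  # this format did not match; try the next one
    return None


def clean_phone(raw):
    """Keep digits only, then take the last 10 (like SUBSTITUTE + RIGHT)."""
    digits = "".join(ch for ch in raw if ch.isdigit())
    return digits[-10:]


def clean_product(raw):
    """Trim whitespace and title-case the name (roughly PROPER(TRIM()))."""
    return raw.strip().title()
```

Rows for which `normalise_date` returns `None` are the ones worth reviewing by hand, just as cells left blank by the final IFERROR would be in the helper-column approach.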
ChatGPT's response is more refined. It provides a single nested formula that uses LET to define intermediate variables, which makes the formula more readable. It also proactively suggests Power Query as an alternative, something Gemma 4 doesn't mention. On phone number cleaning, ChatGPT also warns that some Indian mobile numbers starting with certain digits might be misinterpreted and suggests a validation step.
Conclusion: ChatGPT wins this round, but not overwhelmingly. Gemma 4's method is correct and would hold up in a production environment. ChatGPT's answer is more comprehensive, better structured, and shows a better grasp of real-world edge cases. If you are already familiar with data cleaning strategies, Gemma 4 gives you enough to get the job done.
Test 2: Pivot Table Logic - Designing and Building Pivot Table Formulas
Scenario: A sales dataset with columns Region, Salesperson, Product Category, Quarter, and Revenue. The task is to have each model suggest a suitable Pivot Table structure and then provide equivalent formulas for users who need a formula-based method (a common need when the data source is updated frequently and the summary should recalculate automatically).
The prompt given to both models:
I have a sales table with Region (North/South/East/West), Salesperson (names), Product Category (Electronics/Furniture/Software), Quarter (Q1-Q4), and Revenue. Suggest a pivot table layout to analyse revenue by region and category, then give me the SUMIFS formulas to replicate this as a formula-based summary table.
Gemma 4 proposed a clear two-dimensional layout with regions as rows and product categories as columns, exactly what most analysts want. The SUMIFS formulas it generated are accurate, with appropriate absolute and relative references. It also suggested adding a Grand Total row and column using the SUM function.
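To make the layout concrete, here is a small Python sketch of what a region-by-category summary computes: each cell is the sum of Revenue over the rows matching one region and one category, which is what a two-condition SUMIFS does. The function name and the row format (a list of dictionaries keyed by column name) are our assumptions for illustration.

```python
from collections import defaultdict


def revenue_by_region_and_category(rows):
    """Sum Revenue for every (Region, Product Category) pair.

    Each returned cell corresponds to one two-condition SUMIFS in the
    summary table; row and column totals correspond to the SUM cells.
    """
    cells = defaultdict(float)
    region_totals = defaultdict(float)
    category_totals = defaultdict(float)
    for row in rows:
        region = row["Region"]
        category = row["Product Category"]
        revenue = row["Revenue"]
        cells[(region, category)] += revenue
        region_totals[region] += revenue      # Grand Total column
        category_totals[category] += revenue  # Grand Total row
    return dict(cells), dict(region_totals), dict(category_totals)
```

Comparing this output against the spreadsheet's summary table is a quick way to confirm that the SUMIFS ranges and anchoring are right.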
ChatGPT created a similar layout but went further: it suggested GETPIVOTDATA for users who prefer a true Pivot Table, provided a SUMPRODUCT alternative for older versions of Excel, and included conditional formatting suggestions to highlight the cells with the highest revenue. It also suggested a filter-based approach for interactive dashboards.
Conclusion: ChatGPT wins again, primarily because of its depth and the additional suggestions that less experienced users will find helpful. Gemma 4's core answer is accurate and useful, and its SUMIFS formulas work as written. The gap here lies in the "what else you should consider" aspect rather than in accuracy.
Test 3: Creating a VBA Macro - Merging Multiple Worksheets
This is where ChatGPT was expected to excel, and it did — but Gemma 4 surprised us.
Task: Write a VBA macro that iterates through all worksheets in a workbook (except the "Summary" worksheet), copies data from a consistent range (A2 to the last row in column D) on each worksheet, and pastes it sequentially into the Summary worksheet with an added column holding the source worksheet name.
Gemma 4 produced a working macro. It uses the expected loop, For Each ws In ThisWorkbook.Worksheets, checks If ws.Name <> "Summary", finds the last row using Cells(Rows.Count, 1).End(xlUp).Row, and appends the data to the Summary sheet. The source sheet name is written to column E. The code ran without errors on the test workbook.
ChatGPT produced a more robust version. It includes error handling with On Error Resume Next around the sheet operations, adds a confirmation dialog at the end showing the number of rows merged, clears the Summary sheet before writing (with a user confirmation prompt), and adds explanatory comments for each section. It also suggests wrapping the routine in Application.ScreenUpdating = False to improve performance.
Conclusion: ChatGPT wins on production quality. Gemma 4's macro works, which is really impressive for a free, local model. But ChatGPT's version is the one you'd actually want to deploy in an enterprise environment, with its error handling, user feedback, and performance optimization. For anyone learning VBA through AI-generated macros, both models are useful starting points.
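The merge logic described above is simple enough to sketch outside VBA. The Python below is our own illustration, not either model's macro: it models a workbook as a dictionary mapping sheet names to lists of rows (header first), which is an assumption made so the example runs without Excel.

```python
def merge_sheets(workbook, summary_name="Summary"):
    """Collect rows from every sheet except the summary sheet.

    `workbook` maps sheet name -> list of rows, where row 0 is the header.
    Each merged row keeps columns A-D and gains the source sheet name as a
    fifth value, mirroring the macro's column E.
    """
    merged = []
    for sheet_name, rows in workbook.items():
        if sheet_name == summary_name:
            continue  # same role as: If ws.Name <> "Summary"
        for row in rows[1:]:  # start at row 2, skipping the header
            merged.append(list(row[:4]) + [sheet_name])
    return merged
```

The returned list plays the role of the Summary sheet's contents, and its length is the "rows merged" count that ChatGPT's version reports in its confirmation dialog.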
Test 4: Suggestions for charts and visualizations
A dataset was described to both models: monthly revenue and customer numbers for four product lines over two years, with the goal of presenting trends to a non-technical management audience. Each model was asked which chart types to use and how to structure the visuals.
Gemma 4 suggested line charts for revenue trends over time (one line per product), grouped bar charts to compare product lines by quarter, and combination charts (line plus bar) to show revenue against customer numbers on two axes. These suggestions are solid and conventional.
ChatGPT offered the same three suggestions but added sparklines for executive summary dashboards, a waterfall chart to show year-on-year revenue change, and specific formatting advice: color palettes, axis label formatting, and a note about avoiding 3D charts in executive presentations. It also suggested a dashboard layout with charts arranged in a logical reading order.
Conclusion: ChatGPT wins on presentation sense. If you already have experience with charts and visuals, Gemma 4's recommendations are perfectly adequate. ChatGPT's advantage lies in the design and communication layer, the kind of advice that separates a technically correct chart from one that actually gets its point across to stakeholders.
Test 5: Debugging Formulas - Diagnosing Formula Errors
The following flawed formula was pasted into both models. They were required to identify and fix all the issues:
=IFERROR(VLOOKUP(A2,Sheet2!B:F,5,TRUE),"Not Found")+IF(C2>"100",D2*0.1,D2*0.05)
There are several issues here: the VLOOKUP match type should probably be FALSE for an exact match; the IF condition compares C2 to the text string "100" instead of the number 100; and IFERROR only wraps the VLOOKUP, so the expression as a whole can still fail, for example when the IF part errors or when the text "Not Found" is added to a number.
Gemma 4 detected two of the three issues. It correctly identified the TRUE/FALSE match-type problem and the text-versus-number comparison in the IF statement. It missed the incomplete IFERROR coverage.
ChatGPT discovered all three issues. It rewrote the formula with an IFERROR function wrapping the entire expression, changed TRUE to FALSE, removed the quotation marks around the number 100, and suggested using XLOOKUP as a modern alternative if the user is using Microsoft 365. It also explained why each change was necessary.
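Why the match type matters can be shown with a small model of the two lookup modes. The Python below is a simplified sketch, not Excel's actual implementation: it assumes a table sorted by key and models approximate match as "the largest key less than or equal to the lookup value", which is the documented behaviour of VLOOKUP with TRUE on sorted data. The function names and sample data are ours.

```python
from bisect import bisect_right


def lookup_exact(keys, values, key, default="Not Found"):
    """Model of VLOOKUP(..., FALSE) wrapped in IFERROR: match or default."""
    for k, v in zip(keys, values):
        if k == key:
            return v
    return default


def lookup_approximate(sorted_keys, values, key, default="Not Found"):
    """Model of VLOOKUP(..., TRUE) on sorted keys.

    Returns the value for the largest key <= `key`. Only a lookup value
    below the first key fails, so most missing keys never reach IFERROR.
    """
    position = bisect_right(sorted_keys, key)
    if position == 0:
        return default
    return values[position - 1]


ids = [1001, 1002, 1005]
bonuses = [50, 75, 90]
missing_id = 1003  # not present in the table

exact_result = lookup_exact(ids, bonuses, missing_id)
approx_result = lookup_approximate(ids, bonuses, missing_id)
```

Here `exact_result` is "Not Found" while `approx_result` is 75, the bonus belonging to ID 1002. With TRUE, a missing ID silently picks up a neighbouring row's value, which is why the fallback text in the original formula would rarely appear.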
Conclusion: ChatGPT is the clear winner here. Debugging formulas requires the model to reason about the many interacting parts of an expression, and ChatGPT's deeper analytical ability showed. However, the fact that Gemma 4 detected two of the three issues is still really helpful; for many users, those two fixes alone solve the problem. For debugging more complex formulas, check out my guide on advanced Excel formulas.
Practical advice
After running these tests and extensively using both models, here are some practical tips:
- Start with Gemma 4 for everyday tasks. Writing formulas, basic data cleaning logic, simple VBA: Gemma 4 handles these well and is completely free. Install it via Ollama and let it run in the background.
- Switch to ChatGPT for more complex tasks. When a task requires in-depth multi-step reasoning, file analysis, or production-level code, switch to ChatGPT. The free plan meets most of these needs; the Plus plan is well worth it if you use it daily.
- Use both models for learning. Ask Gemma 4 for a formula, then ask ChatGPT the same question and compare the approaches. This is one of the most effective learning strategies.
- Default to Gemma 4 for sensitive data. If you're unsure whether your data should be uploaded to the cloud, use a local model. You can always double-check with ChatGPT using anonymized or sample data if you need its additional capabilities.