Microsoft Excel
The Microsoft Excel connector bridges the gap between ad-hoc analytics and the world’s most widely used spreadsheet format. It allows users to query .xls or .xlsx files using standard SQL without manual imports, formula wrangling, or copy-pasting required.
At its core, the connector treats each worksheet as a table and each defined cell range as a structured dataset. When you query an Excel file, the connector dynamically maps rows and columns into a tabular schema—effectively transforming what's often a human-formatted report into something your query engine can reason about.
This connector is designed for both simplicity and precision. With just a few lines of configuration, you can open up entire workbooks for exploration. Yet, for more complex files—where data may begin several rows down, or occupy only a subset of columns—you can fine-tune behavior using a rich set of options that control how Metaform interprets the file structure.
Configuration Options
The simplest possible configuration enables the connector and lets it discover and read any Excel files within your workspace:
"excel": {
"type": "excel",
"extensions": ["xls", "xlsx"]
}
This setup is ideal for quick inspection or exploratory analysis. Metaform automatically detects sheets, infers column types, and exposes data as queryable tables.
Advanced Configuration
You’ll often want greater control—such as defining which sheet to query, where the data starts, or whether to use headers. All of these advanced options can be set directly in the table command, allowing you to fine-tune how Metaform reads and structures spreadsheet data at runtime.
The table below summarizes the complete set of available configuration parameters:
| Option | Default | Description |
|---|---|---|
| firstColumn | (none) | Defines the leftmost column index to read within a spreadsheet. Columns are 1-indexed, meaning column 1 corresponds to “A.” Setting this to 0 tells Metaform to begin at the first available column automatically. Use this when you want to target a specific subset of columns and skip descriptive or empty regions on the left. |
| headerRow | 0 | Identifies which row contains column headers. Rows are 0-indexed. Setting this to -1 tells Metaform that there are no headers and it should generate automatic field names (field_1, field_2, etc.). This is particularly useful for structured but unlabeled data—such as exports from legacy systems or generated reports. |
| lastColumn | (none) | Defines the rightmost column index to include in the read range. Columns are 1-indexed. When set to 0, Metaform reads all remaining columns to the right. Limiting the column range can improve scan performance when only a portion of the sheet is needed. |
| lastRow | 1048576 | Specifies the final row Metaform should process. The default value corresponds to Excel’s maximum row count (1,048,576). Typically, you don’t need to change this, but setting it to a smaller value can help exclude trailing summaries, blank space, or appended notes at the end of a worksheet. |
| sheetName | (first sheet) | Names the Excel sheet that Metaform should query. If omitted, Metaform defaults to the first sheet in the workbook. If the specified sheet is not found, Metaform throws an exception. Use this to target a specific tab in multi-sheet workbooks—common when reports store monthly or departmental data separately. |
Putting It Together
When you configure Metaform's Microsoft Excel connector, think of the spreadsheet as a coordinate grid—columns labeled from left to right (A, B, C, …) and rows numbered from top to bottom. The combination of firstRow, headerRow, and lastRow defines the window of data that Metaform will read.
Scenario
Let's imagine a common scenario: you receive a quarterly sales report exported from Microsoft Excel. The top of the sheet contains titles, blank rows, and metadata before the actual table begins.
| Customer ID | Region | Total Sales |
| --------------------------------------- | ------ | ----------- |
| **Quarterly Report – Eastern Division** | | |
| **Generated on:** 2025-09-30 | | |
| | | |
| C001 | VA | 12,000 |
| C002 | NC | 9,800 |
| C003 | MD | 11,250 |
| **Summary:** Eastern region totals only | | |
Without configuration, Metaform would attempt to parse everything—including titles and notes—as data, which can lead to empty fields or mismatched column types.
To extract only the table (rows 4 through 7, columns A through C), define the region explicitly:
"headerRow": 2,
"lastRow": 3,
"firstColumn": 1,
"lastColumn": 3,
"sheetName": "Q3_Report"
How this works
sheetName: Targets the "Q3_Report" sheet.headerRow: 2: Tells Metaform the fourth row (remember, rows are 0-indexed and blank rows are skipped) contains column headers.lastRow: 3: Stops reading before reaching the "Summary" text.firstColumn/lastColumn: Reads only columns A through C (1–3 in Excel terms).
With this configuration, Metaform will infer a clean three-column schema:
| Column | Type | Example Value |
|---|---|---|
Customer ID | STRING | "001" |
Region | STRING | "VA" |
Total Sales | DOUBLE | 12000 |
Working Example
To explore Excel support in Metaform using this example, you'll need two files:
- The Microsoft Excel connector configuration (
.json) - A sample Excel workbook (
.xlsx) to query
Installing the Connector
The Excel connector is distributed as a standalone JSON file that registers a new storage plugin with your local Metaform instance. It defines the connector type, supported file extensions, and parsing options.
To install the connector, run the following command:
curl -sSL https://docs.metaform.com/resources/examples/microsoft-excel-connector.json | curl -X POST -H "Content-Type: application/json" -d @- http://localhost:8047/storage/excel.json
This command downloads the connector definition and registers it via Metaform’s REST API. Once installed, the Excel format will appear in the Storage tab under the name excel, and will be immediately available for queries—no restart required.
Trying the Example Workbook
Download the example Excel file here:
https://docs.metaform.com/resources/examples/data.xlsx
The workbook contains a sheet named Q3_Report with:
- A title row and blank padding at the top
- Standard headers in row 4
- Three data rows
- A summary line at the bottom
This layout makes it ideal for testing the connector’s firstRow, headerRow, and lastRow parameters.
Save the file to a location accessible to your Metaform instance—for example:
~/data.xlsx
Then, try querying it in the Web user interface:
SELECT *
FROM table(excel.`data.xlsx` (
type => 'excel',
sheetName => 'Q3_Report',
headerRow => 2,
firstColumn => 1,
lastColumn => 3,
lastRow => 3))
If the connector is installed correctly, Metaform will:
- Recognize the
.xlsxfile - Infer a clean schema
- Return a three-column table based on the detected data region