# TidyData: Columns & Direct

This section gives the first simple examples of how we create TidyData from our selections.

## Source Data

The data source we're using for these examples is shown below:

| <span style="color:green">Note - this particular table has some very verbose headers we don't care about, so we'll be using `bounded=` to remove them from the previews as well as to show just the subset of data we're working with.</span>|
|-----------------------------------------|

The [full data source can be downloaded here](https://github.com/mikeAdamss/tidychef/raw/main/tests/fixtures/xlsx/ons-oic.xlsx). We'll be using the 8th tab, named "Table 3a".

In [None]:
from tidychef import acquire, preview
from tidychef.selection import XlsxSelectable

table: XlsxSelectable = acquire.xlsx.http("https://github.com/mikeAdamss/tidychef/raw/main/tests/fixtures/xlsx/ons-oic.xlsx", tables="Table 3a")
preview(table, bounded="A4:H10")

## TidyData & Columns

This example introduces two new classes.

- `TidyData` is a class representing tidy data created from the selections.
- `Column` represents a single column of data within this TidyData.

| <span style="color:green">Note - for this section we're going to use only a small sample of the potential observations we could select. This will look a little odd in the previews but is necessary to restrict the size of the TidyData previews to something practical for the context of this documentation.</span>|
|-----------------------------------------|

## Example 1: Time Periods

This is the simplest example: we extract just the observations and the time period.

A new and critical part will be the following:

```
tidy_data = TidyData(
    observations,
    Column(period.attach_directly(right))
)
```

This specifies the visual relationship between the "period" selection of cells and the "observations" selection of cells: the observations are directly to the right of the period.
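To make the idea of a direct relationship concrete, here is a minimal pure-Python sketch of the lookup it implies. It is not tidychef's implementation: the function name `attach_directly_right`, the `(column index, row number)` cell keys and all cell values are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

# A cell position as (column index, row number), e.g. (0, 9) stands for A9.
# This representation is an assumption for the sketch, not tidychef's own.
Cell = Tuple[int, int]


def attach_directly_right(
    observations: Dict[Cell, str], headers: Dict[Cell, str]
) -> List[Tuple[str, str]]:
    """Pair each observation with the header cell it sits directly right of."""
    tidy_rows = []
    # Walk the observations row by row, left to right.
    for (col, row), value in sorted(
        observations.items(), key=lambda item: (item[0][1], item[0][0])
    ):
        # "Directly right" means the header is on the same row, further left.
        candidates = [
            label
            for (hcol, hrow), label in headers.items()
            if hrow == row and hcol < col
        ]
        if len(candidates) != 1:
            raise ValueError(f"Expected one header for cell {(col, row)}")
        tidy_rows.append((value, candidates[0]))
    return tidy_rows


# Hypothetical values: periods in column A, observations in columns B and C.
periods = {(0, 9): "2010", (0, 10): "2011"}
values = {(1, 9): "10.5", (2, 9): "11.2", (1, 10): "12.1", (2, 10): "13.4"}

rows = attach_directly_right(values, periods)
for row in rows:
    print(row)
```

Each observation ends up on its own row alongside the period found on the same spreadsheet row, which is the shape the TidyData in this example takes.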

In [None]:
from tidychef import acquire, preview
from tidychef.direction import right
from tidychef.selection import XlsxSelectable
from tidychef.output import Column, TidyData

table: XlsxSelectable = acquire.xlsx.http("https://github.com/mikeAdamss/tidychef/raw/main/tests/fixtures/xlsx/ons-oic.xlsx", tables="Table 3a")

# Create our selections
# Note, we're not taking all observations as we want to keep the 
# output suitably small for this example.
observations = table.excel_ref("B9:C18").label_as("Value")
period = table.excel_ref("A9:A18").label_as("Period")

# First preview the selections
preview(observations, period, bounded="A4:H18")

# Then create some simple TidyData
tidy_data = TidyData(
    observations,
    Column(period.attach_directly(right))
)

# And view it
print(tidy_data)

## Example 2: Tidy Data

The above example doesn't accomplish a lot, so let's expand our code to add some more columns.

If you've been working through the documentation so far, you should be able to make sense of the following; there's nothing new here.

**Do** take your time and go through it line by line to make sure you understand it. It's probably the most important example in this documentation.

In [None]:
from tidychef import acquire, preview
from tidychef.direction import right, down
from tidychef.selection import XlsxSelectable
from tidychef.output import Column, TidyData

table: XlsxSelectable = acquire.xlsx.http("https://github.com/mikeAdamss/tidychef/raw/main/tests/fixtures/xlsx/ons-oic.xlsx", tables="Table 3a")

# Create our selections
observations = table.excel_ref("B9:C18").label_as("Value")
period = table.excel_ref("A9:A18").label_as("Period")
housing = table.excel_ref("B5").expand(right).label_as("Housing")
annual_dataset_code = housing.shift(down).label_as("Annual Dataset Identifier")
quarterly_dataset_code = annual_dataset_code.shift(down).label_as("Quarterly Dataset Identifier")
monthly_dataset_code = quarterly_dataset_code.shift(down).label_as("Monthly Dataset Identifier")

# First preview the selections
preview(observations, housing, annual_dataset_code, quarterly_dataset_code, monthly_dataset_code, period, bounded="A4:H18")

# Then create some simple TidyData
tidy_data = TidyData(
    observations,
    Column(period.attach_directly(right)),
    Column(housing.attach_directly(down)),
    Column(annual_dataset_code.attach_directly(down)),
    Column(quarterly_dataset_code.attach_directly(down)),
    Column(monthly_dataset_code.attach_directly(down))
)

# And view it
print(tidy_data)
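
The way Example 2 combines a `right` relationship with several `down` relationships can be sketched in plain Python. This is a conceptual illustration only, not how tidychef works internally: the helper names `find_direct` and `build_tidy`, the `(column index, row number)` cell keys and every cell value below are hypothetical.

```python
from typing import Dict, List, Tuple

# A cell position as (column index, row number), e.g. (1, 5) stands for B5.
Cell = Tuple[int, int]


def find_direct(obs_cell: Cell, headers: Dict[Cell, str], direction: str) -> str:
    """Return the single header an observation is directly right of or below."""
    col, row = obs_cell
    if direction == "right":
        # Observation is to the right: header shares the row, sits further left.
        matches = [v for (c, r), v in headers.items() if r == row and c < col]
    elif direction == "down":
        # Observation is below: header shares the column, sits further up.
        matches = [v for (c, r), v in headers.items() if c == col and r < row]
    else:
        raise ValueError(f"Unsupported direction: {direction}")
    if len(matches) != 1:
        raise ValueError(f"Expected one header for cell {obs_cell}")
    return matches[0]


def build_tidy(
    observations: Dict[Cell, str],
    columns: List[Tuple[str, Dict[Cell, str], str]],
) -> List[Dict[str, str]]:
    """Build one record per observation, with one entry per column."""
    records = []
    for cell in sorted(observations, key=lambda c: (c[1], c[0])):
        record = {"Value": observations[cell]}
        for name, headers, direction in columns:
            record[name] = find_direct(cell, headers, direction)
        records.append(record)
    return records


# Hypothetical cell contents laid out like the example's selections.
values = {(1, 9): "10.5", (2, 9): "11.2", (1, 10): "12.1", (2, 10): "13.4"}
period = {(0, 9): "2010", (0, 10): "2011"}
housing = {(1, 5): "Category X", (2, 5): "Category Y"}
annual_code = {(1, 6): "CODE1", (2, 6): "CODE2"}

records = build_tidy(
    values,
    [
        ("Period", period, "right"),
        ("Housing", housing, "down"),
        ("Annual Dataset Identifier", annual_code, "down"),
    ],
)
for record in records:
    print(record)
```

Every column is resolved independently against the same observation cell, which is why further columns can be added without changing the others.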