Optimize demo data loading by reducing in-memory consumption #2895

### Problem Description

A dataset from SDV's demo datasets can be downloaded using the `download_demo` function:
```python3
from sdv.datasets.demo import download_demo

data, metadata = download_demo('multi_table', 'fake_hotels')
```
Under the hood, the function creates three in-memory representations of the `fake_hotels` dataset:
* `data_io`, which is the `data.zip` archive downloaded directly from S3
* `in_memory_directory`, which is a dictionary of the file contents read as bytes
* `data`, which is a dictionary of the tables after loading them into pandas

Holding all three copies at once is inefficient and can cause out-of-memory issues when the dataset is large.

### Expected behavior

To optimize the code, we can update it to do the following:
* maintain only one dictionary -- the pandas dictionary
  * open each CSV file and load it directly into pandas, without a separate `in_memory_directory` variable.
  * do this file by file so that we don't have multiple files open at once.
* delete the `data.zip` from memory after finishing.

In the end, the only variable left in memory should be `data`.
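
A minimal sketch of the idea, not SDV's actual implementation: the helper name `load_csvs_from_zip` is hypothetical, and it assumes the archive stores each table as a `.csv` file whose file name is the table name. Each archive member is opened and passed straight to `pandas.read_csv`, so no intermediate dictionary of bytes is built and only one file is open at a time.

```python
import zipfile
from pathlib import PurePosixPath

import pandas as pd


def load_csvs_from_zip(data_io):
    """Read every CSV in a zip archive into a dict of DataFrames.

    Hypothetical helper for illustration. ``data_io`` is a binary
    file-like object (for example an ``io.BytesIO``) holding the zip.
    """
    data = {}
    with zipfile.ZipFile(data_io) as archive:
        for member in archive.namelist():
            # Skip anything that is not a CSV (directories, metadata, ...).
            if not member.endswith('.csv'):
                continue

            # Assumption: the table name is the file name without '.csv'.
            table_name = PurePosixPath(member).stem

            # Open one member at a time and let pandas read from the stream
            # directly, so the raw bytes are never stored separately.
            with archive.open(member) as csv_file:
                data[table_name] = pd.read_csv(csv_file)

    return data
```

After the call returns, the caller can release the downloaded archive, for example with `data_io.close()` followed by `del data_io`, which leaves `data` as the only remaining copy.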
