Metadata-Version: 2.1
Name: datasets-summarizer
Version: 0.1.2
Summary: Datasets Summary Viewer. Enables the exploration of dataset search results in Jupyter Notebooks
Home-page: https://github.com/soniacq/DatasetsVis
Author: Sonia Castelo
Author-email: s.castelo@nyu.edu
License: UNKNOWN
Description: # DatasetsSummarizer
        
        Datasets Summarizer is compatible with Jupyter Notebooks. Need the x and y values based on any similarity metric to generated the similarity plot between datasets. Supports the metadata format generated by [datamart-profiler](https://docs.auctus.vida-nyu.org/python/datamart-profiler.html#) library to generate the Detail View to explore each dataset.
        
        
        ![System screen](https://github.com/soniacq/DatasetsVis/blob/main/DatasetsSummarizer/imgs/datasets_summarizer_view.png)
        
        ( Click one dataset from the list of results to open the Detail View.)
        
        ## Demo
        
        Live demo (Google Colab):
        - [Dataset results for Taxi query](https://colab.research.google.com/drive/11NT0qudr2di9NXlR-DxZa0buzGreOqxs?usp=sharing)
        
        In Jupyter Notebook:
        ```Python
        import DatasetsSummarizer
        data = DatasetsSummarizer.get_taxi_data()
        DatasetsSummarizer.plot_datasets_summary(data)
        ```
        
        ## Install
        
        ### Option 1: install via pip:
        ~~~~
        pip install datasets-summarizer
        ~~~~
        
        ## Custom similarity metric
        
        Use a subset or add a new entry (`x` and `y` values ) based on a different similatiry metric. For example, here we added `x` and `y` values based on a similarity metric using a modified version of the titles. Note that `modif_title_x` and `modif_title_y` must be included in the dataframe.
        
        ```Python
        new_similarity_metrics = [{'name': 'Title', 'x': 'title_x', 'y': 'title_y'},
                                  {'name': 'ModifiedTitle', 'x': 'modif_title_x', 'y': 'modif_title_y'}
                                 ]
        ```
        
        Then, we can pass this new similarity metrics as a parameter of our visualization
        ```Python
        DatasetsSummarizer.plot_datasets_summary(dataframe, new_similarity_metrics)
        ```
        
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: BSD License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.6
Description-Content-Type: text/markdown
