pygmt.clib.Session.virtualfile_to_dataset
- Session.virtualfile_to_dataset(vfname, output_type='pandas', header=None, column_names=None, dtype=None, index_col=None)
- Output a tabular dataset stored in a virtual file to a different format. The format of the dataset is determined by the output_type parameter.
- Parameters:
- vfname (str) – The virtual file name that stores the result data.
- output_type (Literal['pandas', 'numpy', 'file', 'strings'], default: 'pandas') – Desired output type of the result data.
  - "pandas" will return a pandas.DataFrame object.
  - "numpy" will return a numpy.ndarray object.
  - "file" means the result was saved to a file and will return None.
  - "strings" will return the trailing text only as an array of strings.
 
- header (int | None, default: None) – Row number containing column names for the pandas.DataFrame output. header=None means not to parse the column names from the table header. Ignored if the row number is larger than the number of headers in the table.
- column_names (list[str] | None, default: None) – The column names for the pandas.DataFrame output.
- dtype (type | dict[str, type] | None, default: None) – Data type for the columns of the pandas.DataFrame output. Can be a single type for all columns or a dictionary mapping column names to types.
- index_col (str | int | None, default: None) – Column to set as the index of the pandas.DataFrame output.
 
- Return type: pandas.DataFrame | numpy.ndarray | None
- Returns:
- result – The result dataset. If output_type="file", returns None.
- Examples

>>> from pathlib import Path
>>> import numpy as np
>>> import pandas as pd
>>>
>>> from pygmt.helpers import GMTTempFile
>>> from pygmt.clib import Session
>>>
>>> with GMTTempFile(suffix=".txt") as tmpfile:
...     # prepare the sample data file
...     with Path(tmpfile.name).open(mode="w") as fp:
...         print(">", file=fp)
...         print("1.0 2.0 3.0 TEXT1 TEXT23", file=fp)
...         print("4.0 5.0 6.0 TEXT4 TEXT567", file=fp)
...         print(">", file=fp)
...         print("7.0 8.0 9.0 TEXT8 TEXT90", file=fp)
...         print("10.0 11.0 12.0 TEXT123 TEXT456789", file=fp)
...
...     # file output
...     with Session() as lib:
...         with GMTTempFile(suffix=".txt") as outtmp:
...             with lib.virtualfile_out(
...                 kind="dataset", fname=outtmp.name
...             ) as vouttbl:
...                 lib.call_module("read", [tmpfile.name, vouttbl, "-Td"])
...                 result = lib.virtualfile_to_dataset(
...                     vfname=vouttbl, output_type="file"
...                 )
...                 assert result is None
...                 assert Path(outtmp.name).stat().st_size > 0
...
...     # strings, numpy and pandas outputs
...     with Session() as lib:
...         with lib.virtualfile_out(kind="dataset") as vouttbl:
...             lib.call_module("read", [tmpfile.name, vouttbl, "-Td"])
...
...             # strings output
...             outstr = lib.virtualfile_to_dataset(
...                 vfname=vouttbl, output_type="strings"
...             )
...             assert isinstance(outstr, np.ndarray)
...             assert outstr.dtype.kind in ("S", "U")
...
...             # numpy output
...             outnp = lib.virtualfile_to_dataset(
...                 vfname=vouttbl, output_type="numpy"
...             )
...             assert isinstance(outnp, np.ndarray)
...
...             # pandas output
...             outpd = lib.virtualfile_to_dataset(
...                 vfname=vouttbl, output_type="pandas"
...             )
...             assert isinstance(outpd, pd.DataFrame)
...
...             # pandas output with specified column names
...             outpd2 = lib.virtualfile_to_dataset(
...                 vfname=vouttbl,
...                 output_type="pandas",
...                 column_names=["col1", "col2", "col3", "coltext"],
...             )
...             assert isinstance(outpd2, pd.DataFrame)
>>> outstr
array(['TEXT1 TEXT23', 'TEXT4 TEXT567', 'TEXT8 TEXT90',
       'TEXT123 TEXT456789'], dtype='<U18')
>>> outnp
array([[1.0, 2.0, 3.0, 'TEXT1 TEXT23'],
       [4.0, 5.0, 6.0, 'TEXT4 TEXT567'],
       [7.0, 8.0, 9.0, 'TEXT8 TEXT90'],
       [10.0, 11.0, 12.0, 'TEXT123 TEXT456789']], dtype=object)
>>> outpd
      0     1     2                   3
0   1.0   2.0   3.0        TEXT1 TEXT23
1   4.0   5.0   6.0       TEXT4 TEXT567
2   7.0   8.0   9.0        TEXT8 TEXT90
3  10.0  11.0  12.0  TEXT123 TEXT456789
>>> outpd2
   col1  col2  col3             coltext
0   1.0   2.0   3.0        TEXT1 TEXT23
1   4.0   5.0   6.0       TEXT4 TEXT567
2   7.0   8.0   9.0        TEXT8 TEXT90
3  10.0  11.0  12.0  TEXT123 TEXT456789
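
The example above exercises column_names but not dtype or index_col. The sketch below illustrates what those three parameters mean for the pandas.DataFrame output. It uses plain pandas only and does not call PyGMT. The use of DataFrame.astype and DataFrame.set_index as stand-ins is an assumption about comparable behavior, not a description of how PyGMT implements the conversion. The table values are copied from the sample data above.

```python
import pandas as pd

# Same numeric and trailing-text values as the sample table above,
# built directly in pandas (no GMT session involved).
df = pd.DataFrame(
    [
        [1.0, 2.0, 3.0, "TEXT1 TEXT23"],
        [4.0, 5.0, 6.0, "TEXT4 TEXT567"],
        [7.0, 8.0, 9.0, "TEXT8 TEXT90"],
        [10.0, 11.0, 12.0, "TEXT123 TEXT456789"],
    ]
)

# column_names: replace the default integer labels 0..3.
df.columns = ["col1", "col2", "col3", "coltext"]

# dtype: a dictionary converts only the columns it names;
# a single type would instead apply to every column.
df = df.astype({"col1": "int64", "col2": "float32"})

# index_col: promote one column to the row index.
df = df.set_index("col1")

print(df)
```

With a dictionary for dtype, columns that are not listed keep their original type, which is usually what is wanted when the table carries a trailing text column that cannot be cast to a number.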