Catecon: Data Morphisms

A data morphism in Catecon carries stored values that can be used in a composite. For example, a data morphism f might record that f(2) = 6. The domain of a data morphism is generally ℕ, but the codomain may have a complex form such as (𝔽×𝔽)×(ℤ×ℤ)×(ℕ×ℕ).

The old data morphisms in Catecon were simply a mapping from an index to a value, which was good for testing but can take a lot of memory. Contiguous, random, and URL ranges are now included for a more compact representation.

A single data morphism can have as many data values as memory allows, followed by a sequence of ranges. To evaluate a data morphism at some index, the data values are consulted first. If there is a value for the specified index, it is returned. If not, the ranges are searched in sequence for one that contains the specified index.
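The lookup order described above can be sketched in a few lines of Python. This is only an illustration, not Catecon's actual code: the `evaluate` function, the tuple layout for a range, and the choice to return `None` when nothing matches are all assumptions made for the example.

```python
from typing import Any, Callable, Optional

# Hypothetical representation: a range is (start, count, value_at) and is
# assumed to cover the indices start .. start + count - 1.
Range = tuple[int, int, Callable[[int], Any]]


def evaluate(data: dict[int, Any], ranges: list[Range], index: int) -> Optional[Any]:
    """Consult the explicit data values first, then the ranges in sequence."""
    if index in data:
        return data[index]
    for start, count, value_at in ranges:
        if start <= index < start + count:
            return value_at(index)  # the first range containing the index wins
    return None  # assumption: nothing is defined at this index


data = {2: 6}
ranges = [(0, 10, lambda i: i * 100), (5, 10, lambda i: -i)]

evaluate(data, ranges, 2)   # 6: the explicit value hides the first range
evaluate(data, ranges, 7)   # 700: both ranges contain 7, the earlier one is used
evaluate(data, ranges, 12)  # -12: only the second range contains 12
```

Because explicit values are checked before any range, a handful of data values can override individual indices of a much larger range.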

Contiguous Range

A contiguous range, called simply a range in Catecon to save space, is given by a starting index, a count of the succeeding indices, and a start value that is incremented for each index.
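As a rough sketch of how such a range could be evaluated, the class below stores the three numbers and computes a value on demand. The class name, field names, and the step of one per index are assumptions for illustration; so is the reading that the range covers exactly `count` indices beginning at the starting index.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ContiguousRange:
    start_index: int  # first index covered by the range
    count: int        # assumed: number of indices covered, including the first
    start_value: int  # value at start_index

    def contains(self, index: int) -> bool:
        return self.start_index <= index < self.start_index + self.count

    def value_at(self, index: int) -> Optional[int]:
        if not self.contains(index):
            return None
        # Assumed step of 1: the value grows by one for each index past the start.
        return self.start_value + (index - self.start_index)


r = ContiguousRange(start_index=10, count=5, start_value=100)
print(r.value_at(12))  # 102
```

Three integers stand in for what would otherwise be `count` stored index/value pairs, which is where the memory saving comes from.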

Random Range

A random range likewise has a starting index and a count, plus a min and max bounding the interval in which a random number is generated. Each time you compose with a data morphism containing a random range you get different random numbers. Compose with an identity map to have “static” random numbers, but at the cost of storage.
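The trade-off between fresh and "static" random numbers can be illustrated as follows. The `RandomRange` class and the `freeze` helper are hypothetical names, and drawing uniformly distributed floats is an assumption; the source does not say which distribution or number type Catecon uses.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass
class RandomRange:
    start_index: int
    count: int
    lo: float  # the "min" of the interval
    hi: float  # the "max" of the interval

    def value_at(self, index: int) -> Optional[float]:
        if not (self.start_index <= index < self.start_index + self.count):
            return None
        # A new number is drawn on every evaluation; nothing is remembered.
        return random.uniform(self.lo, self.hi)


def freeze(r: RandomRange) -> dict[int, float]:
    """Draw once per index and store the results, trading memory for stability."""
    return {i: r.value_at(i) for i in range(r.start_index, r.start_index + r.count)}


r = RandomRange(start_index=0, count=4, lo=1.0, hi=2.0)
frozen = freeze(r)  # these four stored values no longer change between reads
```

Here `freeze` plays the role the text assigns to composing with an identity map: the numbers become fixed, but every one of them now has to be stored.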

URL Range

A URL range is one obtained from downloading a file. If the file is JSON, it needs to be an array, and each entry in the array needs to look like the data morphism’s codomain. There’s no validation that the data conforms to the codomain until you start evaluating the data. A URL range has a start index, and the count is determined from the length of the downloaded array. When you enter the URL and create the range, Catecon attempts to download the data and attach it to the range. When your diagram is saved, the URL data is removed. When your diagram is loaded, the data is downloaded again. In this manner you do not save the downloaded data in your diagram when it is saved or uploaded.
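The life cycle of a URL range might look roughly like the sketch below. All function names and the dictionary layout are invented for the example, the URL is a placeholder, and the download itself is left out: the text of the file is passed in as a string so the example runs without a network connection.

```python
import json
from typing import Any, Optional


def attach_download(url: str, start_index: int, text: str) -> dict[str, Any]:
    """Build a URL range from already-downloaded JSON text (hypothetical layout)."""
    payload = json.loads(text)
    if not isinstance(payload, list):
        raise ValueError("the JSON document must be an array")
    # The count is not entered by the user; it comes from the array length.
    return {"url": url, "start": start_index, "count": len(payload), "data": payload}


def value_at(url_range: dict[str, Any], index: int) -> Optional[Any]:
    offset = index - url_range["start"]
    if 0 <= offset < url_range["count"]:
        # Entries are returned as-is; nothing checks them against a codomain here.
        return url_range["data"][offset]
    return None


def strip_for_save(url_range: dict[str, Any]) -> dict[str, Any]:
    """Drop the downloaded payload so only the URL and start index are saved."""
    return {k: v for k, v in url_range.items() if k not in ("data", "count")}


rng = attach_download("https://example.org/points.json", 5, "[[1, 2], [3, 4], [5, 6]]")
saved = strip_for_save(rng)  # {'url': 'https://example.org/points.json', 'start': 5}
```

On load, the saved record would be passed back through something like `attach_download` after fetching the file again, which restores both the data and the count.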

CSV Files

Much data is saved as comma-separated values (.csv), although the separators are often tabs instead. For example, the GAIA data set is saved as .csv.gz files here.

The first file has the following first ten columns for the first data row:

solution_id	source_id	random_index	ref_epoch	ra	ra_error	dec	dec_error	parallax	parallax_error
1635378410781930000	65408	973786105	2015	44.99615	14.37993	0.005616	6.517028

This then appears to be a data morphism from ℕ to ℕ×ℕ×ℕ×ℤ×𝔽×𝔽×𝔽×𝔽×𝔽×𝔽, where the ℤ is for the reference epoch (watch out for year zero). Some columns have no data. These generate nulls, which could lead to unexpected behavior depending on what your diagram expects.
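To make the null behavior concrete, here is a small sketch that reads tab-separated text and converts each cell according to a per-column type, turning empty cells into `None`. The `parse_rows` function, the column names, and the sample values are made up for the example and are not taken from the GAIA files.

```python
import csv
import io
from typing import Any, Callable, Optional


def parse_rows(text: str, converters: list[Callable[[str], Any]],
               delimiter: str = "\t") -> tuple[list[str], list[tuple[Optional[Any], ...]]]:
    """Return the header and one tuple per data row; empty cells become None."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader)
    rows = []
    for raw in reader:
        # Pad short rows so that every column in the header has a slot.
        raw = raw + [""] * (len(header) - len(raw))
        rows.append(tuple(conv(cell) if cell != "" else None
                          for conv, cell in zip(converters, raw)))
    return header, rows


# Invented sample: the last column of the data row is empty.
sample = "id\tepoch\tra\tra_error\n7\t2015\t44.5\t\n"
header, rows = parse_rows(sample, [int, int, float, float])
print(rows[0])  # (7, 2015, 44.5, None)
```

Each tuple corresponds to one value of the product codomain, and a `None` marks a factor for which the file held no data, so whatever consumes the row has to be prepared for it.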

Harry Dole

"…And one day a great and mighty wind will encompass the surface of the earth and wipe clean the scourge of wooly thinking once and for all!" — Time Bandits
