2022-10-29
plait.py - a program for generating fake data from composable yaml templates
plait.py
plait.py is a program for generating fake data from composable yaml templates.
The idea behind plait.py is that it should be easy to model fake data that has an interesting shape. Currently, many fake data generators model their data as a collection of IID variables; with plait.py we can stitch together those variables into a more coherent model.
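The difference between independent fields and a stitched-together model can be sketched in plain Python. This does not use plait.py; the function names, the age range and the growth rule are assumptions made up for the illustration.

```python
import random


def iid_record(rng):
    """Each field is drawn on its own; age has no influence on height."""
    return {
        "age": rng.uniform(5, 40),
        "height": rng.gauss(65, 5),
    }


def dependent_record(rng):
    """Height is derived from age, so the two fields move together."""
    age = rng.uniform(5, 40)
    # hypothetical growth rule: under 15, only a fraction of adult height
    growth = min(1.0, age / 15)
    return {
        "age": age,
        "height": rng.gauss(65, 5) * growth,
    }


if __name__ == "__main__":
    rng = random.Random(0)
    print(iid_record(rng))
    print(dependent_record(rng))
```

In the first generator a 6 year old is as likely to be tall as a 30 year old; in the second, the shape of the data carries the relationship between the fields.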
some example uses for plait.py are:
- generating mock application data in test environments
- validating the usefulness of statistical techniques
- creating synthetic datasets for performance tuning databases
features
- declarative syntax
- use basic faker.rb fields with #{} interpolators
- sample and join data from CSV files
- lambda expressions, switch and mixture fields
- nested and composable templates
- static variables and hidden fields
an example template
    # a person generator
    define:
      min_age: 10
      minor_age: 13
      working_age: 18

    fields:
      age:
        random: gauss(25, 5)
        # minimum age is $min_age
        finalize: max($min_age, value)

      gender:
        mixture:
          - value: M
          - value: F

      name: "#{name.name}"

      job:
        value: "#{job.title}"
        onlyif: this.age > $working_age

      address:
        template: address/usa.yaml

      phone:
        # add a phone if the person is older than the minor age
        template: device/phone.yaml
        onlyif: this.age > ${minor_age}

      # we model our height as a gaussian that varies based on
      # age and gender
      height:
        lambda: this._base_height * this._age_factor

      _base_height:
        switch:
          - onlyif: this.gender == "F"
            random: gauss(60, 5)
          - onlyif: this.gender == "M"
            random: gauss(70, 5)

      _age_factor:
        switch:
          - onlyif: this.age < 15
            lambda: 1 - (20 - (this.age + 5)) / 20
          - default:
            value: 1
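To make the age, gender and height part of the template easier to follow, here is a rough plain-Python reading of it. It is not how plait.py evaluates templates. It assumes that the mixture field picks M or F with equal weight and that fields with a leading underscore are the hidden ones, so they are left out of the result. The name, job, address and phone fields are skipped, and `gen_person` is a name invented for this sketch.

```python
import random

MIN_AGE = 10  # mirrors $min_age in the template


def gen_person(rng):
    # age: gauss(25, 5), then finalize with max($min_age, value)
    age = max(MIN_AGE, rng.gauss(25, 5))

    # gender: mixture field (assumed here to weight M and F equally)
    gender = rng.choice(["M", "F"])

    # _base_height: switch on gender
    if gender == "F":
        base_height = rng.gauss(60, 5)
    else:
        base_height = rng.gauss(70, 5)

    # _age_factor: switch on age, falling back to the default of 1
    if age < 15:
        age_factor = 1 - (20 - (age + 5)) / 20
    else:
        age_factor = 1

    # height: lambda over the two helper fields, which are not emitted
    return {"age": age, "gender": gender, "height": base_height * age_factor}


if __name__ == "__main__":
    rng = random.Random(42)
    for _ in range(3):
        print(gen_person(rng))
```

The age factor is 0.75 at the minimum age of 10 and reaches 1 at age 15, so the height of younger people is scaled down smoothly.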
how it's different
some specific examples of what plait.py can do:
- generate proportional populations using census data and CSVs
- create realistic zipcodes by state, city or region (also using CSVs)
- create a taxi trip dataset with a cost model based on geodistance
- add seasonal patterns (daily, weekly, etc) to data
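As an illustration of the first item, proportional sampling from a CSV can be sketched with the Python standard library alone. This is not plait.py's CSV syntax (see docs/FORMAT.md for that). The region names and counts are invented placeholders, not census data, and `load_weights` and `sample_regions` are hypothetical helper names.

```python
import csv
import io
import random

# placeholder data standing in for a census CSV (values are invented)
POPULATION_CSV = """region,population
north,50
south,30
east,15
west,5
"""


def load_weights(text):
    """Read region names and their population counts from CSV text."""
    rows = list(csv.DictReader(io.StringIO(text)))
    regions = [row["region"] for row in rows]
    weights = [float(row["population"]) for row in rows]
    return regions, weights


def sample_regions(rng, n, text=POPULATION_CSV):
    """Draw n regions, each with probability proportional to its population."""
    regions, weights = load_weights(text)
    return rng.choices(regions, weights=weights, k=n)


if __name__ == "__main__":
    rng = random.Random(7)
    print(sample_regions(rng, 10))
```

Regions with a larger population count appear more often in the generated records, which is what keeps the synthetic population proportional to the source data.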
usage
installation
    # install with python
    pip install plaitpy
    # or with pypy
    pypy-pip install plaitpy
cloning the repo for development
    git clone https://github.com/plaitpy/plaitpy
    # get the fakerb repo
    git submodule init
    git submodule update
generating records from command line
specify a template as a yaml file, then generate records from that yaml file.
    # a simple example (if cloning plait.py repo)
    python main.py templates/timestamp/uniform.yaml
    # if plait.py is installed via pip
    plait.py templates/timestamp/uniform.yaml
generating records from API
    import plaitpy

    # load a template from a yaml file
    t = plaitpy.Template("templates/timestamp/uniform.yaml")

    # generate a single record, then a batch of 10
    print(t.gen_record())
    print(t.gen_records(10))
looking up faker fields
plait.py also simplifies looking up faker fields:
    # list faker namespaces
    plait.py --list
    # lookup faker namespaces
    plait.py --lookup name
    # lookup faker keys
    # (-ll is short for --lookup)
    plait.py --ll name.suffix
documentation
yaml file commands
see docs/FORMAT.md
datasets
see docs/EXAMPLES.md
also see templates/ dir
troubleshooting
see docs/TROUBLESHOOTING.md
Dependent Markov Processes
To simulate data that comes from many interdependent Markov processes (a Markov ecosystem), see the plaitpy-ipc repository.
future direction
If you have ideas for features to add, open an issue. Feedback is appreciated!
License
MIT