vw - Vowpal Wabbit -- fast online learning tool
VW options:
  --random_seed arg          seed random number generator
  --ring_size arg            size of example ring
Update options:
  -l [ --learning_rate ] arg Set learning rate
  --power_t arg              t power value
  --decay_learning_rate arg  Set decay factor for learning_rate between passes
  --initial_t arg            initial t value
  --feature_mask arg         Use existing regressor to determine which parameters may be updated. If no initial_regressor given, also used for initial weights.
Weight options:
  -i [ --initial_regressor ] arg  Initial regressor(s)
  --initial_weight arg       Set all weights to an initial value of arg
  --random_weights arg       make initial weights random
  --input_feature_regularizer arg  Per-feature regularization input file
Parallelization options:
  --span_server arg          Location of server for setting up spanning tree
  --threads                  Enable multi-threading
  --unique_id arg (=0)       unique id used for cluster parallel jobs
  --total arg (=1)           total number of nodes used in cluster parallel job
  --node arg (=0)            node number in cluster parallel job
Diagnostic options:
  --version                  Version information
  -a [ --audit ]             print weights of features
  -P [ --progress ] arg      Progress update frequency. int: additive, float: multiplicative
  --quiet                    Don't output diagnostics and progress updates
  -h [ --help ]              Look here: http://hunch.net/~vw/ and click on Tutorial
Feature options:
  --hash arg                 how to hash the features. Available options: strings, all
  --ignore arg               ignore namespaces beginning with character <arg>
  --keep arg                 keep namespaces beginning with character <arg>
  --redefine arg             redefine namespaces beginning with characters of string S as namespace N. <arg> shall be in the form 'N:=S' where ':=' is the operator. Empty N or S are treated as the default namespace. Use ':' as a wildcard in S.
  -b [ --bit_precision ] arg number of bits in the feature table
  --noconstant               Don't add a constant feature
  -C [ --constant ] arg      Set initial value of constant
  --ngram arg                Generate N-grams. To generate N-grams for a single namespace 'foo', arg should be fN.
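As a worked example of the update options above, the following sketch trains a model with an explicit learning rate that decays between passes (it assumes `vw` is installed and uses a hypothetical file name; the data lines follow VW's `label |namespace feature:value` input format):

```shell
# Create a tiny training set in VW format: label |namespace feature:value ...
cat > train.dat <<'EOF'
1 |features price:0.23 sqft:0.25 age:0.05
0 |features price:0.18 sqft:0.15 age:0.35
EOF

# -l sets the initial learning rate; --decay_learning_rate scales it
# between passes. Multiple passes require a cache (-c).
vw -d train.dat -l 0.5 --decay_learning_rate 0.95 --passes 2 -c -f model.vw
```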
  --skips arg                Generate skips in N-grams. Used in conjunction with --ngram, this can generate generalized n-skip-k-grams. To generate skips for a single namespace 'foo', arg should be fN.
  --feature_limit arg        limit to N features. To apply to a single namespace 'foo', arg should be fN
  --affix arg                generate prefixes/suffixes of features; argument '+2a,-3b,+1' means generate 2-char prefixes for namespace a, 3-char suffixes for b, and 1-char prefixes for the default namespace
  --spelling arg             compute spelling features for a given namespace (use '_' for the default namespace)
  --dictionary arg           read a dictionary for additional features (arg either 'x:file' or just 'file')
  --dictionary_path arg      look in this directory for dictionaries; defaults to the current directory or env{PATH}
  --interactions arg         Create feature interactions of any level between namespaces
  --permutations             Use permutations instead of combinations for feature interactions within the same namespace
  --leave_duplicate_interactions  Don't remove interactions with duplicate combinations of namespaces. For example, '-q ab -q ba' is a duplicate, and '-q ::' contains many more.
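To illustrate the n-gram and affix options above, a minimal sketch (file names hypothetical, `vw` assumed installed) restricting feature generation to a single text namespace 't':

```shell
# Bigrams plus 1-skip bigrams for namespace 't' only, and 2-character
# prefixes for the same namespace. Without the 't' prefix on the
# argument, the options would apply to all namespaces.
vw -d train.dat --ngram t2 --skips t1 --affix +2t -f model.vw
```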
  -q [ --quadratic ] arg     Create and use quadratic features
  --q: arg                   ':' corresponds to a wildcard for all printable characters
  --cubic arg                Create and use cubic features
Example options:
  -t [ --testonly ]          Ignore label information and just test
  --holdout_off              no holdout data in multiple passes
  --holdout_period arg       holdout period for test only; default 10
  --holdout_after arg        holdout after n training examples; default off (disables holdout_period)
  --early_terminate arg      Number of passes tolerated when holdout loss doesn't decrease before early termination; default 3
  --passes arg               Number of training passes
  --initial_pass_length arg  initial number of examples per pass
  --examples arg             number of examples to parse
  --min_prediction arg       Smallest prediction to output
  --max_prediction arg       Largest prediction to output
  --sort_features            turn this on to disregard the order in which features have been defined; this leads to smaller cache sizes
  --loss_function arg (=squared)  Specify the loss function to be used. Currently available: squared (default), classic, hinge, logistic, and quantile
  --quantile_tau arg (=0.5)  Parameter \tau associated with quantile loss; defaults to 0.5
  --l1 arg                   l_1 lambda
  --l2 arg                   l_2 lambda
  --named_labels arg         use names for labels (multiclass, etc.) rather than integers; the argument specifies all possible labels, comma-separated, e.g. "--named_labels Noun,Verb,Adj,Punc"
Output model:
  -f [ --final_regressor ] arg  Final regressor
  --readable_model arg       Output human-readable final regressor with numeric features
  --invert_hash arg          Output human-readable final regressor with feature names. Computationally expensive.
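Combining the interaction, loss, regularization, and holdout options above, a sketch (hypothetical file names; for logistic loss, labels in the data should be -1/+1):

```shell
# Quadratic interactions between namespaces 'u' and 'i' (e.g. user x item),
# logistic loss with light L2 regularization; every 10th example is held
# out for early-termination checks across the 5 passes.
vw -d train.dat -q ui --loss_function logistic --l2 1e-6 \
   --passes 5 -c --holdout_period 10 -f model.vw
```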
  --save_resume              save extra state so learning can be resumed later with new data
  --save_per_pass            Save the model after every pass over the data
  --output_feature_regularizer_binary arg  Per-feature regularization output file
  --output_feature_regularizer_text arg  Per-feature regularization output file, in text
Output options:
  -p [ --predictions ] arg   File to output predictions to
  -r [ --raw_predictions ] arg  File to output unnormalized predictions to
Reduction options (use [option] --help for more info):
  --bootstrap arg            k-way bootstrap by online importance resampling
  --search arg               Use learning to search; argument = maximum action id, or 0 for LDF
  --replay_c arg             use experience replay at a specified level [b=classification/regression, m=multiclass, c=cost sensitive] with the specified buffer size
  --cbify arg                Convert multiclass on <k> classes into a contextual bandit problem
  --cb_adf                   Do contextual bandit learning with multiline action-dependent features
  --cb arg                   Use contextual bandit learning with <k> costs
  --csoaa_ldf arg            Use one-against-all multiclass learning with label-dependent features. Specify singleline or multiline.
  --wap_ldf arg              Use weighted all-pairs multiclass learning with label-dependent features. Specify singleline or multiline.
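A typical train-then-predict round trip using the model and prediction options above, sketched with hypothetical file names:

```shell
# Train, keeping resumable state and a human-readable dump of the model.
vw -d train.dat --save_resume -f model.vw --invert_hash readable.txt

# Load the model and predict in test-only mode (-t ignores any labels).
vw -d test.dat -t -i model.vw -p predictions.txt
```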
  --interact arg             Put weights on feature products from namespaces <n1> and <n2>
  --csoaa arg                One-against-all multiclass with <k> costs
  --multilabel_oaa arg       One-against-all multilabel with <k> labels
  --log_multi arg            Use online tree for multiclass
  --ect arg                  Error-correcting tournament with <k> labels
  --boosting arg             Online boosting with <N> weak learners
  --oaa arg                  One-against-all multiclass with <k> labels
  --top arg                  top-k recommendation
  --replay_m arg             use experience replay at a specified level [b=classification/regression, m=multiclass, c=cost sensitive] with the specified buffer size
  --binary                   report loss as binary classification on -1,1
  --link arg (=identity)     Specify the link function: identity, logistic, or glf1
  --stage_poly               use stagewise polynomial feature learning
  --lrqfa arg                use low-rank quadratic features with field-aware weights
  --lrq arg                  use low-rank quadratic features
  --autolink arg             create a link function with polynomial d
  --new_mf arg               rank for reduction-based matrix factorization
  --nn arg                   Sigmoidal feedforward network with <k> hidden units
  --confidence               Get confidence for binary predictions
  --active_cover             enable active learning with cover
  --active                   enable active learning
  --replay_b arg             use experience replay at a specified level [b=classification/regression, m=multiclass, c=cost sensitive] with the specified buffer size
  --bfgs                     use BFGS optimization
  --conjugate_gradient       use conjugate gradient based optimization
  --lda arg                  Run LDA with <int> topics
  --noop                     do no learning
  --print                    print examples
  --rank arg                 rank for matrix factorization
  --sendto arg               send examples to <host>
  --svrg                     Streaming Stochastic Variance Reduced Gradient
  --ftrl                     FTRL: Follow the Proximal Regularized Leader
  --pistol                   FTRL: Parameter-free Stochastic Learning
  --ksvm                     kernel SVM
Gradient Descent options:
  --sgd                      use regular stochastic gradient descent update
  --adaptive                 use adaptive, individual learning rates
  --invariant                use safe/importance-aware updates
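The multiclass reductions above are largely interchangeable on the command line; a sketch with a hypothetical data file whose labels are the integers 1..3:

```shell
# One-against-all over 3 classes (labels in multiclass.dat are 1..3).
vw -d multiclass.dat --oaa 3 -f multi.vw

# Error-correcting tournament is a drop-in alternative reduction:
vw -d multiclass.dat --ect 3 -f multi.vw
```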
  --normalized               use per-feature normalized updates
  --sparse_l2 arg (=0)       use per feature normalized updates
Input options:
  -d [ --data ] arg          Example set
  --daemon                   persistent daemon mode on port 26542
  --port arg                 port to listen on; use 0 to pick an unused port
  --num_children arg         number of children for persistent daemon mode
  --pid_file arg             Write pid file in persistent daemon mode
  --port_file arg            Write port used in persistent daemon mode
  -c [ --cache ]             Use a cache. The default is <data>.cache
  --cache_file arg           The location(s) of cache_file
  -k [ --kill_cache ]        do not reuse an existing cache: always create a new one
  --compressed               use gzip format whenever possible. If a cache file is being created, this option creates a compressed cache file. A mixture of raw-text and compressed inputs is supported with autodetection.
  --no_stdin                 do not default to reading from stdin
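The daemon options above let a trained model serve predictions over TCP; clients write one example per line and read a prediction back per line. A sketch, assuming `vw` and `nc` are installed and `model.vw` exists:

```shell
# Serve the model in test-only daemon mode with 4 worker children.
vw --daemon --port 26542 -i model.vw -t --num_children 4 &

# Send an unlabeled example and read the prediction from the socket.
echo " |features price:0.23 sqft:0.25" | nc localhost 26542
```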