Please visit https://bitbucket.org/kayontoga/rattle/commits/ for the definitive list of changes. This NEWS file will only be updated for major releases going forward.

rattle 5.3.6 20200501 Graham.Williams@togaware.com

  * Bug fixes: predict for xgboost; ignore NA in checking for binary class; varImp log handling.

rattle 5.3.1 20191123 Graham.Williams@togaware.com

  * Convert datasets to be tibbles, moving towards tidyverse compliance. Require tibble package. Also require bitops as needed when loading a tibble as an RDataset into Rattle.

rattle 5.3.0 20191123 Graham.Williams@togaware.com

  * Cleanup and prepare for CRAN release.

rattle 5.2.8 20191016 Graham.Williams@togaware.com

  * Remove dependencies on archived packages RGtk2Extras and playwith.
  * Bug fix View Data button to just call R's View() rather than a glade data view or RGtk2Extras data viewer.
  * Update weatherAUS dataset with most recent weather observations now with 166,310 data points.

rattle 5.2.7 20190407 Graham.Williams@togaware.com

  * Bug fix - the export PMML file dialog stopped working - perhaps a RGtk2 update. It was using glade. Instead create the dialog directly in code. Reported by Julian Ochoa.

rattle 5.2.6 20190331 Graham.Williams@togaware.com

  * Bug fix - rattle was incorrectly specifying load instead of open in a GTK file chooser which did not work on an RGtk2 update. Reported by Dave Arnold and Eric Mattys.

rattle 5.2.5 2018-09-26 17:23:00 Graham.Williams@togaware.com

  * Add labels to var imp plot for conditional forest.

rattle 5.2.4 2018-09-23 10:43:12 Graham.Williams@togaware.com

  * Bug fix in getCategoricVariables() when no target is defined and include.target is true.
  * Bug fix in kmeans stats when the data is scaled. Reported by Caleb Terrel Orellana.

rattle 5.2.3 2018-09-17 10:43:12 Graham.Williams@togaware.com

  * Bug fix random forest Errors and OOB ROC buttons are not available for conditional random forests. Reported by Cristina Acasuso.
rattle 5.2.2 2018-09-14 21:12:31 Graham.Williams@togaware.com

  * Bug fix xgboost response command as it is now including two target columns from the previous bug fix which added target into predict. Reported by Aous Alex Abdo.

rattle 5.2.1 2018-08-19 20:31:55 Graham.Williams@togaware.com

  * Improve handling of Missing RGtk2 message.
  * Bug fix re default maxdepth in rpart tree. Reported by Eugene Dubossarsky.

rattle 5.2.0 2018-08-12 15:17:12 Graham.Williams@togaware.com

  * Remove dependency on RGtk2 and check it dynamically. Rattle has more functionality than just the GUI yet we force installation of RGtk2 which is problematic on some platforms.
  * Return the datasets to rattle package. Has caused too much confusion as a separate package.

rattle 5.1.6 2018-08-12 15:17:12 Graham.Williams@togaware.com

  * Bug fix for new rpart.plot with roundint= handled automatically.
  * Reduce width of bars in ggVarImp() plot.

rattle 5.1.5 2018-07-01 17:31:22 Graham.Williams@togaware.com

  * Remove deprecated connect-r logo. Reported by Bob Muenchen.
  * Correct and update Help menus. Reported by Bob Muenchen.
  * Remove Report button until updated to newer functionality. Reported by Bob Muenchen.

rattle 5.1.4 2018-05-22 07:05:18 Graham.Williams@togaware.com

  * Bug fix: Remove na.omit from the calls to generate stats for a hierarchical clustering. Results in errors for the weather dataset. Reported by Tony Nolan.

rattle 5.1.3 2017-10-29 21:25:08 Graham.Williams@togaware.com

  * Bug fix: xgboost evaluate to score file failing. Needs target in the predict command to succeed! Actually needs a fix to predict.xb.formula but include a workaround for now. Reported by Dwight Barry.

rattle 5.1.1 2017-09-08 16:08:03 Graham.Williams@togaware.com

  * Update and bug fix to riskchart for risk AUC as provided by Cameron Chisholm.

rattle 5.1.0 2017-09-04 08:20:34 Graham.Williams@togaware.com

  * Resolve all final check tests, redo testing, and release to CRAN.
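
Note on 5.2.0 (checking for RGtk2 dynamically): the general pattern of testing for an optional package at run time, instead of declaring a hard dependency, might look like the sketch below. The helper names optional_package_available() and start_gui_if_possible() are hypothetical and are not Rattle's actual code; only base R's requireNamespace() is assumed.

```r
# Hypothetical helpers, not Rattle's implementation.
# Report whether an optional package can be loaded, without attaching it
# and without raising an error when it is missing.
optional_package_available <- function(pkg) {
  # requireNamespace() returns FALSE (rather than failing) if pkg is absent.
  requireNamespace(pkg, quietly = TRUE)
}

# Only proceed with GUI start-up when the GUI toolkit is installed.
start_gui_if_possible <- function(pkg = "RGtk2") {
  if (!optional_package_available(pkg)) {
    message("Package '", pkg, "' is not installed; GUI features are unavailable.")
    return(invisible(FALSE))
  }
  invisible(TRUE)
}
```

With this arrangement the non-GUI functions remain usable on platforms where the toolkit cannot be installed.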
rattle 5.0.19 2017-07-10 15:14:34 Graham.Williams@togaware.com

  * Debug the xgboost interface - limited to binary classification tasks for now. Key is that the R code is exported and can be used as a template for extended modelling.

rattle 5.0.18 2017-06-27 06:54:48 Graham.Williams@togaware.com

  * Tune Boost interface.

rattle 5.0.17 2017-06-24 10:58:15 Graham.Williams@togaware.com

  * Move the dataset from the rattle package to a separate rattle.data package in line with CRAN guidelines to have a separate package for slower changing datasets of considerable size. This will also allow the option to provide further datasets for rattle as part of that package.
  * Use weather.csv as the sample for both R and Microsoft R as weatherAUS.csv is too large to include in a CRAN package.
  * Ensure strings are treated as categoricals on loading the data with Microsoft R so as to conform to read.csv() and to be consistent with the non-Microsoft R version of Rattle.

rattle 5.0.16 2017-06-17 08:18:26 Graham.Williams@togaware.com

  * Update the weather dataset from the Australian Bureau of Meteorology and add sample weatherAUS.xdf to the package to be loaded as the default example when Microsoft R is detected. XDF is a file system based data format used by Microsoft R to handle datasets of *any* size rather than being limited by available computer RAM.

rattle 5.0.15 2017-06-17 07:48:30 Graham.Williams@togaware.com

  * Merge RevoScaleR (Microsoft R) support for NeuralNetworks and KMeans from Microsoft India Data Group team.

rattle 5.0.14 2017-06-12 13:12:00 Graham.Williams@togaware.com

  * Merge support for RGtk2 2.20.31 and 2.20.33 to resolve the bug across all installations.

rattle 5.0.13 2017-06-05 16:51:18 Graham.Williams@togaware.com

  * Merge initial xgboost support from Zhou Fang. This is in testing and will become the default boosting algorithm soon.

rattle 5.0.12 2017-05-30 13:52:51 Graham.Williams@togaware.com

  * Bug fix call to errorMatrix() where counts= is not count=.
  * Bug fix to evaluate where respcmd for random forest has disappeared when incorporating MRS updates.

rattle 5.0.11 2017-05-26 17:43:01 Graham.Williams@togaware.com

  * RGtk2 version 2.20.33 released and caused some issues with Rattle. Heuristic test of libglade/GtkBuilder began failing and retval no longer used for obtaining returned values.

rattle 5.0.10 2017-04-30 13:57:16 Graham.Williams@togaware.com

  * Review predict.hclust() to use cutree() by default and predict.kmeans() for a Euclidean distance approach as an option. Bug report by Hamed Mamani.

rattle 5.0.9 2017-04-14 13:17:33 Graham.Williams@togaware.com

  * Introduce a checksum for R datasets (data.frames) so that we can identify when a user has changed the R dataset outside of Rattle, and have the new version loaded.

rattle 5.0.8 Graham.Williams@togaware.com

  * Incorporate updates for ggraptR(). Close to functional and so it is nearly ready for release.

rattle 5.0.7 2017-03-05 18:13:22 Graham.Williams@togaware.com

  * Support for rxDTree, rxDForest, rxGlm, rxLinMod. Thanks to Durga Prasad Chappidi.

rattle 5.0.6 2017-02-25 09:51:55 Graham Williams

  * errorMatrix() more robust to character values and mismatch in factor levels. Thanks to Fang Zhou.

rattle 5.0.5 Graham.Williams@togaware.com 2017-02-15 07:22:43 Graham Williams

  * Update weatherAUS dataset.
  * Bug fix for sample XDF dataset - if smaller than crv$xdf_preview then load the whole dataset into memory.
  * ggVarImp now has n= option for the top n variables. Also supports xgb.Booster models from xgboost.

rattle 5.0.4 Graham.Williams@togaware.com 2017-02-04 15:15:34 Graham Williams

  * Bug fix ggVarImp to work for randomForest() when importance=FALSE.
  * Add log= option to ggVarImp() for a log scale.
  * Add pc (percentages) and digits to errorMatrix().

rattle 5.0.3 Graham.Williams@togaware.com 2017-02-02 15:18:28 Graham Williams

  * Add sample_n() for xdf - temporary until dplyrXdf supports it.
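
Note on 5.0.9 (checksum for R datasets): one way to detect that a data frame has changed between two points in time is to fingerprint its serialized bytes. The sketch below uses only base R and the bundled tools package; dataset_checksum() is a hypothetical name, and the approach is an assumption for illustration rather than the method Rattle uses. Because it hashes raw bytes, a change in internal representation can also change the fingerprint.

```r
# Hypothetical sketch, not Rattle's implementation.
# Fingerprint a data frame by hashing its serialized bytes.
dataset_checksum <- function(df) {
  tmp <- tempfile()
  on.exit(unlink(tmp))
  # Serialize the object to a raw vector, write it out, then hash the file.
  writeBin(serialize(df, connection = NULL), tmp)
  unname(tools::md5sum(tmp))
}

ds <- data.frame(x = 1:3, y = c("a", "b", "c"))
before <- dataset_checksum(ds)

ds$x[2] <- 99L                         # simulate an edit made outside the GUI
after <- dataset_checksum(ds)

changed <- !identical(before, after)   # TRUE would signal that a reload is needed
```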
rattle 5.0.2 Graham.Williams@togaware.com 2016-10-02 15:06:52

  * Implement generic ggVarImp() to plot variable importance for different models.
  * Implement errorMatrix() as a replacement for generating code to do this (pcme()) during a rattle run.
  * Update the weatherAUS dataset from the Australian Bureau of Meteorology.
  * Add a subtitle to riskchart().

rattle 5.0.1 Graham.Williams@togaware.com

  * Begin exposing :: prefix in the log tab. It's educational and self documenting.
  * Support Explore -> Distribution -> Group By to include the numeric target variable (usually only categorics listed) if it has 10 or fewer levels. Suggested by Eugene Dubossarsky.
  * Additional XDF support: rxDForest.

rattle 5.0.0 Graham.Williams@togaware.com

  * Initial support for the XDF format: rxDTree.

rattle 4.2.0 Graham.Williams@togaware.com 2016-07-22 06:19:15

  * Include dplyr as an Import.
  * Add support for Eugene Dubossarsky's ggraptR.
  * Cleanup and perfect executeModelRF and Log code.

rattle 4.1.8 Graham.Williams@togaware.com 2016-06-24 20:36:51

  * Add transparency to ggpairs plot. Reported by Eugene Dubossarsky.

rattle 4.1.7 Graham.Williams@togaware.com 2016-06-21 21:02:13

  * Bug fix for Benfords when the target is numeric. An empty Group By will use the target variable to stratify. Reported by Eugene Dubossarsky.
  * Spelling fixes provided by George Wilson.

rattle 4.1.6 Graham.Williams@togaware.com 2016-06-21 20:34:49

  * Bug fix for new version of GGally - to get target colours. Reported by Eugene Dubossarsky.

rattle 4.1.3 Graham.Williams@togaware.com 2016-05-12 10:22:01

  * Update copyright to 2016.
  * Add stringr dependency.
  * Fix missing comment character in log tab.

rattle 4.1.3 Graham.Williams@togaware.com 2016-03-13 15:07:07

  * Add type= to fancyRpartPlot(). Requested by Michelle Gosse.

rattle 4.1.2 Graham.Williams@togaware.com 2016-03-13 06:24:21

  * Bug fix for missing GUI code for export_filechooserdialog. Reported by Bill Burns.
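
Note on errorMatrix() (5.0.2, 5.0.4, 5.0.6): the quantities these entries refer to, namely a cross-tabulation of actual against predicted classes, optional percentages, and per-class, overall and averaged class error, can be illustrated in base R. The function error_matrix_sketch() and its arguments are hypothetical; the real errorMatrix() may differ in arguments and output, so treat this as a sketch of the idea only.

```r
# Hypothetical base-R sketch; Rattle's errorMatrix() may differ in detail.
error_matrix_sketch <- function(actual, predicted, percentage = TRUE, digits = 1) {
  # Build a common set of levels so character input and mismatched
  # factor levels still line up as a square table.
  lv <- sort(unique(c(as.character(actual), as.character(predicted))))
  counts <- table(Actual = factor(actual, levels = lv),
                  Predicted = factor(predicted, levels = lv))

  # Per-class error: share of each actual class that was misclassified.
  class_error <- 1 - diag(counts) / rowSums(counts)

  # Report cells as percentages of all observations, or as raw counts.
  cells <- if (percentage) round(100 * counts / sum(counts), digits) else counts

  list(matrix        = cells,
       class_error   = class_error,
       overall_error = 1 - sum(diag(counts)) / sum(counts),
       average_error = mean(class_error, na.rm = TRUE))
}

actual    <- c("No", "No",  "No", "Yes", "Yes", "No")
predicted <- c("No", "Yes", "No", "Yes", "No",  "No")
em <- error_matrix_sketch(actual, predicted)
```

Rows are the actual classes and columns the predicted classes, matching the orientation adopted in 2.6.4.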
rattle 4.1.1 Graham.Williams@togaware.com 2016-01-26 19:50:07

  * Bug fix for a single input variable in the dataset when scoring. Reported by Szabo Szilard.

rattle 4.1.0 Graham.Williams@togaware.com 2016-01-26 11:12:01

  * Bug fix calculation of confusion matrices when either actual or predicted values have missing values. Reported by Roger Bohm.
  * Make the rescale.by.group transform more robust by ensuring the by argument is a factor, converting as needed. Reported by Tony Nolan.
  * Bug fix for plots when there is no target in the dataset. Reported by Albert Lee.
  * Bug fix in calculation of the overall error rate in the confusion matrix. Show overall error as percentage not proportion. Reported by Eugene Dubossarsky.
  * Remove grid from ggpairs plot and fine tune for presentation.

rattle 4.0.0 Graham.Williams@togaware.com 2015-09-21 06:00:49

  * Migrate hosting of the package to Bitbucket: https://bitbucket.org/kayontoga/rattle.
  * Use Connect-R logo as the icon for the button.

rattle 3.5.11 Graham.Williams@togaware.com 2015-09-16 19:22:02

  * Add button to toolbar to open a Connect-R page for feature requests.
  * Bug fix confusion matrix Error calculation and average error calculation. Reported by Eugene Dubossarsky.
  * Only default to TIME* variable as target if Survival model is chosen.

rattle 3.5.10 Graham.Williams@togaware.com 2015-09-16 19:22:02

  * Explore tab's Distribution option now allows the user to choose how to group the data for plotting, with the Target as the default but a choice of any Categoric variable available, or none.
  * Bug fix when scoring a clustering with no identifier nor target. Reported by Abhishek Sharma.

rattle 3.5.9 Graham.Williams@togaware.com 2015-09-16 05:53:00

  * Incorporate pairs plots into Distributions option of the Explore tab. Contributed by Jose A Magaña.

rattle 3.5.8 Graham.Williams@togaware.com 2015-08-28 10:21:59

  * Migrate histogram plots to using pipes and generally clean up the code.
  * Introduce appendLibLog to handle namespaces in the Log tab. Namespace prefix is removed and replaced by a library() call as a user would normally do.
  * Migrate Box Plots to using pipes and place multiple box plots or histograms onto a single grid.

rattle 3.5.7 Graham.Williams@togaware.com 2015-08-21 19:17:56

  * Move to using clusplot from cluster rather than plotcluster from fpc to obtain ellipses to show the clusters.

rattle 3.5.6 Graham.Williams@togaware.com 2015-08-20 21:30:29

  * Gracefully handle no network connection in rattleInfo().

rattle 3.5.5 Graham.Williams@togaware.com 2015-08-17 19:29:41

  * Bug fix for traditional graphics and ROCR suite of plots under evaluate tab - need to use namespace to get correct version of plot().

rattle 3.5.4 Graham.Williams@togaware.com 2015-07-26 12:07:02

  * Add palettes= to allow limited changing of colours in fancyRpartPlot().
  * Bug fix for fancyRpartPlot() where rule conditions were being replaced with coloured blocks.

rattle 3.5.3 Graham.Williams@togaware.com

  * Add a test to riskchart() to check if there are more than two classes.

rattle 3.5.2 Graham.Williams@togaware.com

  * Extend Error Matrix calculations in Evaluate to support multinomial targets as well as binomial targets.

rattle 3.5.1 Graham.Williams@togaware.com

  * Bug fix in calculation of overall and average class errors. Thanks to Eugene Dubossarsky.

rattle 3.5.0 Graham.Williams@togaware.com

  * Replace xlsx::read.xlsx() with readxl::read_excel() to remove reliance on Java which has always been problematic in terms of Windows users having trouble installing Java. Thanks to Ed Stoker for testing. (3.4.3)
  * When iterating over kmeans clusters now plot from 1 cluster rather than 3. Thanks to Eugene Dubossarsky. (3.4.4)
  * Updates to normVarNames() due to Hadley's changes to stringr. Also capture other characters to map.
  * Add title.size argument to riskchart(). Also support horizontal legend. Fix the text glob for the Lift label.
  * Revert to using only exported functions from pkgDepTools. (3.4.1)
  * Fix some tooltip and textview typos suggested by Kees Schippers. (3.4.2)
  * Move from weightedKmeans to wskm.
  * Numerous updates to support new CRAN checks, particularly related to use of name spaces and requiring to make rattle depend on RGtk2.
  * weatherAUS dataset is updated.
  * Update rattleInfo() to be more efficient by doing dependency graph myself.

rattle 3.4.0 Graham.Williams@togaware.com 2014-12-29 19:11:59 +1100

  * Revert traditional ROC eval plot to overlay all models on the one plot. Eugene Dubossarsky
  * Bug fix to fancyRpartPlot() from John Vorwald when model$frame$yval all negative.
  * Replace comma in normVarNames().
  * Remove latticist - no longer available on CRAN.

rattle 3.3.0 Graham.Williams@togaware.com 2014-09-09 18:25:21 +1100

  * Migrate to using namespace for external functions.

rattle 3.2.0 Graham.Williams@togaware.com 2014-09-04 06:14:03 +1100

  * Execute button when clicked from the Log Tab will execute all of the code in the Log tab. (Suggested by Scott MacLean, 24 July 2014)
  * Add the average error rate to the evaluations, as proposed on http://www.connect-r.com/.
  * Numerous ggplot2 updates and bug fixes.
  * MS-Windows support for xlsx files bug fixed. Allow sub= option in fancyRpartPlot.

rattle (3.1.0)

  * Numerous updates of plots to use ggplot2 rather than base graphics: ROC curves, riskchart, box plots, histogram plots, pairs plot, Benfords. Advanced Graphics is now the default, reverting to traditional graphics where needed. The migration to ggplot2 is ongoing.
  * Added new Benfords functionality.
  * Added a rescale option to kmeans.
  * New psfchart() for evaluation.
  * New function normVarNames() to normalise variable names to a standard preferred style.
  * Evaluate -> Error Matrix has been updated to report averaged class error and to report class errors.
  * Evaluate -> PrvOb plot bug fix for non-missing data.
  * INSTALL: Remove old INSTALL file - visit rattle.togaware.com for installation instructions.
  * plotNetwork() has been removed - not used by Rattle and generally of limited use. See onepager.togaware.com for the code.
  * No longer report repository revision number in version or about.
  * Miscellaneous bug fixes and stability improvements.
  * weatherAUS dataset is up-to-date.

 -- Graham Williams 2014-07-18 14:32:07 +1100

rattle (2.6.26) unstable; urgency=low

  * Replace .path.package with path.package as requested by Ripley. The hidden version will disappear soon and the new version has been available since 2.13.0.
  * Update boost help to note that it is available only for binary classification.
  * Default stemming for textmining of a corpus is now active if the Snowball package is available.
  * For Advanced Graphics introduce a dendrogram plot using ggplot2.
  * Various text mining improvements. Bug fix in checking if data needs reloading. Support checking if corpus needs reloading. Add extra cursor and status bar messages. For corpus, set default folder to be getwd(). Check for mismatch between number of docs in corpus and the number of targets in .targets.csv. For the Corpus file dialog, do not offer folder creation.
  * Remove macosx special rattle.ui. The ubuntu specific text no longer appears in the saved ui file.
  * Internally: Move rattleGUI to crv from crs. The crs is saved as the state, and this was confusing the GUI on a project restore. Had to ensure we restored rattleGUI with the current rattleGUI - this fixes loadProject bug. Also, in Load project, filter on .RData not .Rdata.
  * Add newdata= to call to predict, in line with the standard approach by party (reference Torsten). Remove the OOB= for predict for cforest. With a new dataset OOB makes no sense. It was in there because newdata= was not being used and positionally having issues.
  * Update fancy rpart plot to reduce colour intensity for printing and a nicer tree structure. Add all class probs to fancy tree.
  * Define paste0 if it is not defined. It was introduced in 2.15.0 but it is too early to assume the world is with us.
  * Replace siatclust with weightedKmeans.
  * Bug fix in OOB plot when impute is off - need to omit missing values. Update message regarding random forest and na.omit() removing all rows, noting the option to use na.roughfix().
  * Fix bug identified by Brian Feeny 121209 - score a RF test dataset without a target variable tries to add one in all NA but fails if it is the last variable.
  * Experimentally add Deducer's data.viewer to View data. Ensure we ask the user, when using Plot Builder, whether it is okay to create a dataset in their workspace. Hopefully keeps us in line, if not strictly in compliance, with CRAN policy.
  * Remove SVG support - RSvgDevice is no longer available.

 -- Graham Williams Sat, 16 Mar 2013 13:27:05 +1100

rattle (2.6.25) unstable; urgency=low

  * Review all of the code and remove two instances of using copyrighted code without attribution. One was a copy of print.rpart, where rattle added a translation wrapper to the text message. Another was code copied from the Internet from David Hand - use the Hmeasure package now. Note in drawTreeNode() reference to the original author and lack of copyright. Note author in Authors@R. ggcorplot is now available from Deducer. Remove it from Rattle. Replace Hand measure with HMeasure from hmeasure. Add Mark Vere Culp as aux author. Remove commented out code. Remove lss and cranSearch - not really part of Rattle.
  * Update to new style Authors@R.

 -- Graham Williams Sat, 23 Jan 2013 13:12:43 +1100

rattle (2.6.24) unstable; urgency=low

  * Bug fix for box plot using ggplot2.
  * Finish the implementation of riskchart using ggplot2 to mimic the old version of risk charts.
  * Remove copied code from print.rpart, known as rattle.print.rpart, and originally used without proper credit to Brian Ripley, but no longer required. Use his original version from rpart itself, though lose the translations.
  * Migrate to a cleaner structure for managing the source package locally at togaware.
  * Bug fix fancy rpart plot to handle regression as suggested by Yana Kane-Esrig.
  * For arules, add option to specify minimum length.
  * Update to new version of RGtk2Extras' dfedit, without a pretty_print option. Also able to assign result into crs$dataset now.
  * Remove two instances of global variable assignments. Temporarily remove PlotBuilder and scoring of manually entered datasets.

 -- Graham Williams Tue, 11 Dec 2012 06:45:50 +1100

rattle (2.6.21) unstable; urgency=low

  * Retain depend on R > 2.12.1.
  * Ensure rattle.togaware.com repo is maintained.
  * Better detect arules error message for duplicate items in a basket.
  * Update ggplot2 calls to conform to 0.92. Also turn advanced graphics on by default. Implement risk charts using ggplot2.
  * Start introducing suppressPackageStartupMessages to avoid excessive messages in the console.
  * Do AUC only for binomial targets.

 -- Graham Williams Mon, 10 Sep 2012 19:27:42 +1000

rattle (2.6.20) unstable; urgency=low

  * Because of use of globalVariables Rattle now depends on R >= 2.15.1. However, check this conditionally to retain backward compatibility for now. Reported by Uwe Ligges.
  * For show arules, eval in global environment else it does not show the rules. Reported by Tania Churchill.

 -- Graham Williams Mon, 23 Jul 2012 02:27:18 +1000

rattle (2.6.19) unstable; urgency=low

  * Depend on weightedKmeans rather than siatclust.
  * Bug fix: correlation plots stopped working.
  * Bug fix: ggcorplot use of size_scale started failing. Perhaps because of new version of ggplot2.
  * Bug fix: notice when a restored project does not have a filename set.
  * Fix some logic errors in rf.
  * Add 0,0 point to evaluateRisk.
  * Make risk, recall, precision as default names in risk chart.
  * Add new riskchart function using ggplot2.
  * Allow additional arguments to fancyRpartPlot passed through to prp.
  * Update copyright to 2012.
  * Allow y for yes in installing initial RGtk2.
  * List global variables to avoid check messages.

 -- Graham Williams Wed, 04 Jul 2012 22:15:27 +1000

rattle (2.6.18) unstable; urgency=low

  * Ensure require uses quietly rather than quiet.
  * Clean up randomForest textview output.
  * Update pmml to 4.0. Fix various format issues and other updates from Tridi of Zementis.
  * Update setupDataset but also note that it is moving into a separate package, container.
  * Get odfweave stuff working again.
  * Update fancyRpartPlot - being used in SIAT software. Can now handle any number of classes.
  * Updates to the pmml rsf code.
  * Bug fix for evaluation of conditional trees and random forests.
  * Further pmml export of randomForest updates.
  * Add PlotBuilder as interactive explore option.
  * Export pmml for glm models.
  * Enhance ggplot2 plotting of boxplot.

 -- Graham Williams Sun, 22 Apr 2012 21:47:00 +1000

rattle (2.6.17) unstable; urgency=low

  * Add a log10 transform to the GUI, R10 prefix, add tooltip, handle it in pmml, create new rattle_macosx.ui. Suggested by Christophe Klopp.
  * Bug fix usage of believeNRows - it was being ignored from the GUI, but is now acted upon. Reported by Andrew Elliott.
  * Add ggplot2 box plots to Advanced Graphics option.
  * Remove the timestamp messages.
  * Update pmml to handle randomForest and rattle to export to pmml.
  * Bug fix in naming the dataset when it is edited.
  * Bug fix for ggcorplot when less than 6 vars - need to map var names into a c() call.

 -- Graham Williams Sun, 19 Feb 2012 21:49:45 +1100

rattle (2.6.16) unstable; urgency=low

  * rattleInfo() now also notes if rattle itself needs upgrading.
  * Bug fix in show association rules. It now works again.
  * Forgot to include rescale.by.group() in NAMESPACE.
  * CITATION to the book rather than the article. That is a more definitive resource, though not freely available.

 -- Graham Williams Sat, 24 Dec 2011 15:35:21 +1100

rattle (2.6.15) unstable; urgency=low

  * Bug fix for Mac OS/X on 2.14.0 with a call to set.cursor failing because the textviews do not yet exist. Problem is that the addFromFile for the GUI is generating a Warning that seems to now stop the file being loaded. Removing the particular XML elements causing the warning (one ubuntu_local and 4 GtkTreeSelections) "fixes" the problem.

 -- Graham Williams Sat, 03 Dec 2011 22:49:18 +1100

rattle (2.6.14) unstable; urgency=low

  * Add OOB ROC button to Forest option of Model tab as suggested by Akbar Waljee.
  * Bug fix for loading R Dataset data frame named dataset. Bug reported by George Dontas.
  * Use roc.plot() from evaluation. Suggested by Akbar Waljee.
  * Use packageStartupMessage.
  * Ensure oob roc plot handles numeric targets.

 -- Graham Williams Wed, 16 Nov 2011 06:01:17 +1100

rattle (2.6.13) unstable; urgency=low

  * Add wtd.quantile type to binning. Suggested by Brenton R. Stone.

 -- Graham Williams Tue, 25 Oct 2011 21:34:13 +1100

rattle (2.6.12) unstable; urgency=low

  * Ensure the data partitions that are specified are appropriate. Also allow some flexibility in specifying: 70 or 70/30 or 70/15/15. For the first two the training is 70% and testing is 30%. For the third, validation is 15% and testing is 15%.
  * Update text mining support for latest version of tm.
  * rattleInfo() was incorrectly counting the number of packages listed.

 -- Graham Williams Sun, 23 Oct 2011 06:00:16 +1100

rattle (2.6.11) unstable; urgency=low

  * Use listAdaVarsUsed in Rattle.
  * Use fancyRpartPlot in Rattle.
  * Note rattle.ui requires gtk > 2.16, not > 2.20. Otherwise fails to start on Mac OS/X.

 -- Graham Williams Wed, 05 Oct 2011 19:12:28 +1100

rattle (2.6.10) unstable; urgency=low

  * Add listAdaUsedVars support function.
  * Workaround CairoDevice issue on Windows by defaulting to not using it, as in the Settings menu.
  * Add common name and crv constant for ewkm.
  * fancyRpartPlot has optional main title as empty string.
  * biclust now reports a biclust built rather than reporting a kmeans built.
  * Add weights plots for ewkm from siatclust.

 -- Graham Williams Sun, 11 Sep 2011 17:08:18 +1000

rattle (2.6.9) unstable; urgency=low

  * AdaBoost now also reports which variables are used in the collection of trees built, and the number of trees in which a variable appears.
  * Add setupDataset and whichNumeric to support encapsulation of data mining objects.
  * Add a fancyRpartPlot so my fancy rpart tree is available outside of the rattle GUI.
  * Correct the textview information relating to confusion matrices.
  * Add doRiskChart to simplify using the risk charts.

 -- Graham Williams Sun, 04 Sep 2011 21:03:32 +1000

rattle (2.6.8) unstable; urgency=low

  * Ensure ggplot2 loaded before plot ctree.
  * Handle probability predictions for ctree and cforest in evaluation.

 -- Graham Williams Tue, 26 Jul 2011 22:03:47 +1000

rattle (2.6.7) unstable; urgency=low

  * Add support for the entropy weighted k-means subspace clustering algorithm from the ewkm package.
  * Ensure rattle can load with only the base package installed (so install.packages is prefixed with utils:::).
  * Migrate away from using installed.packages() since it can be very slow on MS/Windows.
  * Add an experimental dataset option to the command line call to rattle.
  * Allow a bygroup to be used for any numeric transform.
  * Add a plot for association rules.
  * Display a ggplot2 scatterplot if advanced plots is enabled.
  * rattle:::executeExplorePlot made more friendly for calling from outside of Rattle.
  * Tidy up the rattleInfo manual page.
  * Master Makefile should respond with help if no target specified.

 -- Graham Williams Mon, 18 Jul 2011 06:53:47 +1000

rattle (2.6.6) unstable; urgency=low

  * Settings/Tooltips should be shown as TRUE.
  * Add Settings/GGPlot2 to enable enhanced graphics (generally using ggplot2) where they have been implemented.
  * Implement a ggplot2 pairs plot (scatterplot) as the plot to use when ggplot2 is enabled and under Explore/Distributions no variables are chosen to be displayed. Uses ggcorplot from Deducer.
  * Implement use of rpart.plot's prp() when ggplot2 is enabled.

 -- Graham Williams Sat, 09 Apr 2011 22:16:29 +1000

rattle (2.6.5) unstable; urgency=low

  * Add rattleReport() - report on current state of rattle modelling.
  * Restore the ByGroup option for now until it can be coded for the about transforms.
  * Deal with UTF-8 encoding of Japanese filenames in data and evaluate, using iconv.
  * Be sure to include http:// in web links, though on MS/Windows still not working: Could Not Show Link... No application is registered as handling this file
  * On loading a dataset, convert any character variables to be factors. Rattle does not handle character variables, so the translation seems appropriate.
  * Association rules status bar was referring to decision trees. Fixed. (Pointed out by Xiaobo Gu)
  * Fix an introduced bug in handling of categorics in numeric transforms.
  * Fix a bug where imputation for a categoric with class "ordered" and "factor" was treating it as a numeric (because "ordered" is not "factor").
  * Some Help menu items under Test were not loading the required package and thus were not displaying the help.
  * Only do crosstabs when we have categoric variables.
  * Updated translations.

 -- Graham Williams Sun, 13 Mar 2011 16:46:20 +1100

rattle (2.6.4) unstable; urgency=low

  * Confusion matrices transposed to conform to what most people expect: Actual is on left and Predicted is on top. Retain the name as Error Matrix in Rattle for now.
  * Use different pch for a dotchart.
  * Include the install.packages(rattleInfo()) trick in the output of rattleInfo().

 -- Graham Williams Sat, 19 Feb 2011 06:26:09 +1100

rattle (2.6.3) unstable; urgency=low

  * weather.arff Date field should have 'date' data type.
  * The rug plot of histograms is no longer coloured.
    For large datasets, there is much overplotting and so it can in fact be quite misleading.
  * Box plots now use varwidth=TRUE to indicate the distribution of the target variable.
  * Bug fix: exportHClustTab should not have a file argument.

 -- Graham Williams Sun, 13 Feb 2011 21:42:11 +1100

rattle (2.6.2) unstable; urgency=low

  * Rename rattle.info() to rattleInfo(), modelled on sessionInfo() naming. Include available CRAN version of rattle in the output.
  * Ensure connection is closed on pmmltoc export from Rattle.
  * questionDialog needs to not use RGtk2 if RGtk2 is not installed!
  * Emphasise that Rattle is free in loading the rattle package.
  * exportKmeansTab does not require the file argument.

 -- Graham Williams Wed, 02 Feb 2011 05:46:28 +1100

rattle (2.6.1) unstable; urgency=low

  * When exporting a regression model, be sure to use proper slash (i.e., not the Windows slosh) for log tab record of the command.
  * Add rattle.ui to the google code repository.
  * Remove as many literals as possible from the Log tab - so that crs$dataset[crs$sample, c(2:10,14,16:20)] becomes crs$dataset[crs$sample, c(crs$input, crs$target)], for example. Similarly for the set.seed and other data storing variables.
  * Other Log tab cleanup.
  * Fix bug that caused failure on reading an .xls data file.
  * rattle.info() now returns the list of packages that need updating.
  * In exporting a model as C code, if we are Japanese on Windows then note that the encoding is shift-jis rather than utf-8 for some reason.
  * Improve infrastructure for the generation of C code from PMML.

 -- Graham Williams Thu, 13 Jan 2011 21:50:53 +1100

rattle (2.6.0) unstable; urgency=low

  * Keep track of project names and use as default name to save a project to. Suggested by David Cochrane.
  * Add strip.white to the default for reading CSV files. Suggested by Robert Muenchen.
  * Bug fix on resetEvaluateTab - Data row was being reset to sensitive because model was being toggled.
  * Disconnect Rattle versions from google code revision numbers since the revision numbers change with each change to the Wiki.
  * Indicator Variables will Ignore the first of the new indicator variables. Suggested by Robert Muenchen.
  * Include the Target name in listing of a decision tree as a rule set.
  * On adding to the log when saving a plot make sure cairoDevice is loaded and the file name path separators are appropriate. Reported by Shane Butler 11 Dec 2010.
  * Ensure filename string is UTF-8 when exporting a file, to handle Japanese filenames.
  * For nnet, choose a seed so weather generates a non-trivial model.
  * Refer to remapping as recoding in line with commonly used terminology.
  * Default back to showing text on icon for buttons. Seems okay in the new version of Gtk.

 -- Graham Williams Sat, 11 Dec 2010 13:39:55 +1100

rattle (2.5.47) unstable; urgency=low

  * Add a useGtkBuilder argument to rattle(). If NULL, then heuristically determine, otherwise go with the specified choice, if possible.
  * Remove RGtk2, colorspace, and pmml as dependencies. Now dynamically check and offer to install. This also helps reduce chance of the XML/RGtk2 zlib1.dll bug, and also ensures RGtk2 loads before XML to avoid that bug.

 -- Graham Williams Mon, 15 Nov 2010 21:50:15 +1100

rattle (2.5.46) unstable; urgency=low

  * Bug fix for fixTranslations.
  * Save weights information in PMML.
  * Cleanup SVM command generator.

 -- Graham Williams Thu, 11 Nov 2010 19:08:36 +1100

rattle (2.5.45) unstable; urgency=low

  * Check for GtkBuilder handling of the 'requires' tag, and if not handled then don't use GtkBuilder.
  * Bump pmml version through 1.2.25 to 1.2.26.
  * Change default nolan groups for a singularity to 50 rather than 99.
  * PMML bug fix when glm and using weights.
  * Move all variable initialisation from .onLoad to .onAttach. This will ensure .RData saved (and therefore old) versions of the variables will not overwrite the proper versions in a newer release of Rattle.
-- Graham Williams Sat, 09 Oct 2010 08:16:15 +1100 rattle (2.5.44) unstable; urgency=low * Add an include.libpath to rattle.info() to provide information about where the packages are installed. * Check for failed startup of rattle GUI using GtkBuilder (because the Gtk library installed does not recognise 'requires') and suggest a workaround. * Conditionally turn toolbar Text (in addition to just Icons) on. * For loading spreadsheets, make sure RODBC is available and loaded. * Ensure 'ordered categoric' are treated as categoric for Explore, Distribution. -- Graham Williams Tue, 05 Oct 2010 18:08:20 +1100 rattle (2.5.43) unstable; urgency=low * Ensure gtkBuilder is setting the correct translation domain for the interface. * Add global option for not showing timestamps: crv$show.timestamp. * Add optional arg to newProject to not ask about overwriting a project. Default is as previously - to ask. -- Graham Williams Wed, 22 Sep 2010 05:37:53 +1000 rattle (2.5.42) unstable; urgency=low * Update rattle.info() to recursively identify all dependencies, report their version number and any updates available from CRAN and generate command to update packages that have updates available. See ?rattle.info for the options. * Fix bug causing R Dataset option of the Evaluate window to always revert to the first named dataset. * Fix bug in transforms where weights were not being handled in refreshing of the Data tab. * Fix a bug in box plots when trying to label outliers when there aren't any. -- Graham Williams Sun, 19 Sep 2010 05:01:51 +1000 rattle (2.5.41) unstable; urgency=low * Use GtkBuilder for Export dialog. * Test use of glade vs GtkBuilder on multiple platforms. * Rename rattle.info to rattle.version. * Add weight column to data tab. * Support weights for nnet, multinom, survival. * Add weights information to PMML as a PMML Extension. * Ensure GtkFrame is available as a data type whilst waiting for updated RGtk2. * Bug fix to packageIsAvailable not returning any result.
* Replace destroy with withdraw for plot window as the former has started crashing R. * Improve Log formatting for various model build commands. * Be sure to include the car package for Anova for multinom models. * Release pmml 1.2.24: Bug fix glm binomial regression - note as classification model. -- Graham Williams Wed, 15 Sep 2010 14:56:09 +1000 rattle (2.5.40) unstable; urgency=low * Conditionalise useGtkBuilder: if windows and R before 2.12.0 then libglade; if unix and R 2.12.0 then libglade for now (RGtk2 update needed?); all else use GtkBuilder. -- Graham Williams Sun, 22 Aug 2010 12:02:00 +1000 rattle (2.5.39) unstable; urgency=low * Conditionally use either libglade2 or GtkBuilder for the GUI. libglade2 (a separate library to the Gtk+ library) is deprecated and as of R 2.12.0 won't be supported on MS/Windows binary builds. The default is now GtkBuilder (built into the Gtk+ library), and support for libglade2 within Rattle is deprecated. RGtk2 (2.12.18) still has issues in its support of GtkBuilder and is being actively worked on, but Rattle is currently working around these. -- Graham Williams Sat, 21 Aug 2010 07:47:43 +1000 rattle (2.5.38) unstable; urgency=low * Ensure pmml.ksvm will at least run - though resulting PMML not validated. * Bump pmml version to 1.2.23 -- Graham Williams Fri, 06 Aug 2010 05:56:11 +1000 rattle (2.5.37) unstable; urgency=low * The Predictive tab has gone back to being Model. Not sure which is best. * cranSearch defaults to r-project rather than unimelb. * Migrate from RGtk2DfEdit to its replacement, RGtk2Extras. * Revert cairoDevice to being a Suggests rather than Depends. * Remove redundant CITATION from root of package, as the real one is in inst. -- Graham Williams Sat, 31 Jul 2010 14:34:50 +1000 rattle (2.5.36) unstable; urgency=low * Add Bill Venables' searchCRAN example code. * Improve error message when we find duplicate variable names in a loaded file, which might result when there is no header line.
* Add help item for Projects. * On Evaluate with supplied file, use the hdr specified on the Data tab. -- Graham Williams Mon, 12 Jul 2010 06:43:06 +1000 rattle (2.5.35) unstable; urgency=low * Add utility lss function to list object sizes. * Add options text entry for SVM to easily allow other options. * Better formatting of the Log tab. * Use a set.seed for SVM to ensure same model each time. * Add option to random forest to impute missing values rather than simply ignoring the observations. * On Evaluate with supplied file, use the sep specified on the Data tab, thus allowing TXT files. * On loading a new dataset for evaluation be sure to add in any missing columns, and unify the levels. * Improve binning documentation. * Make RGtk2, cairoDevice, colorspace all dependencies so we can get rattle started and then rattle will prompt to install other packages that are missing when it needs them. -- Graham Williams Thu, 01 Jul 2010 15:34:50 +1000 rattle (2.5.34) unstable; urgency=low * When a package is missing, there is now the option to install it right then, and it continues as normal after it gets installed. * Change Suggests to Depends so all used packages get loaded on loading rattle, in an attempt to make it easier to install Rattle. Then the r-cran-rattle package on Debian/Ubuntu will have all required dependencies and a normal install.packages will get all dependencies also, rather than having to use dependencies=c('Depends', 'Suggests'). Penalty is it takes 20 seconds to do 'library(rattle)' on a server and 90 seconds on a netbook - so revert back to not doing this. * Ensure the new train/validate/test scenario is saved across projects. -- Graham Williams Wed, 09 Jun 2010 07:04:08 +1000 rattle (2.5.33) unstable; urgency=low * Bug fix rf.cmd. * Improve scoring functionality: The dataset can have NA's for target, and these can now get scored by rf on Evaluate tab.
Loading a CSV file to be scored no longer needs to have the target column included (previously it needed to be there and have non-NA values). Thanks to Chris Snijders. -- Graham Williams Mon, 31 May 2010 06:22:54 +1000 rattle (2.5.32) unstable; urgency=low * Remove dependency on car - not actually being used at the moment. * For random forest, allow sample size text entry as a single integer or a list, as per randomForest. * Use na.omit with cforest, as is done with randomForest. * For randomForest turn subsampling with replacement off since it is more likely to produce biased importance measures, as explained in the cforest papers. * Fix bug with multiple "contact support" lines in error popups. * When showing the randomForest importance values, sort on the accuracy measure rather than the Gini measure, since the Gini is biased in favour of categoric variables with many categories. * ada boost seed should be 42, like all other seeds. * Tidy up some ada output. * Bug fix - save project for rf failing (looking for rf_sampsize_entry). * Remove text from toolbar by default. * Change order of Forest/Boost buttons on Model tab. * Add tooltips for all toolbar buttons. -- Graham Williams Fri, 28 May 2010 15:47:15 +1000 rattle (2.5.31) stable; urgency=low * Add rattle.info() to list information for debugging purposes. * Bump pmml to 1.2.22 * Fixes from wenching.lin@zementis.com: Extension in Header should be first element. Coefficients in regression models should not be NA (as will be for singularities), but replace with, and so no impact of change. * Ensure Survival defaults are reset appropriately. -- Graham Williams Wed, 19 May 2010 09:50:39 +1000 rattle (2.5.30) stable; urgency=low * On MS/Windows with Japanese, read.csv needs encoding option set with file rather than with read.csv (for UTF-8) but seems okay under other scenarios.
* On MS/Windows with Japanese (UTF-8) the encoding of the variables selected for transforming needs to be UTF-8 for much of the process, but "unknown" when using Rtxt and sprintf (when substituting the variable names) to ensure resulting message is correctly matched for encodings. -- Graham Williams Wed, 19 May 2010 09:47:12 +1000 rattle (2.5.29) stable; urgency=low * Add the translation file. * Fix an Encoding/sprintf issue for Japanese on MS/Windows. * Allow crv$NOTEBOOK.MODEL.NAME to be overridden by other packages (RStat). * When dispatch fails be sure to include the Tab label on which it fails. * Ensure HClust Options are re-enabled on loading a project. -- Graham Williams Sat, 24 Apr 2010 07:32:02 +1000 rattle (2.5.28) stable; urgency=low * Minor format changes for glm and rf model output. * Capture additional survival model error and suggest a solution. * Remove spurious additional plot for Survival Residual plot. * Update log tab labels to be more generic. * Update tooltips to be generic and add survival tooltips. -- Graham Williams Thu, 22 Apr 2010 06:21:58 +1000 rattle (2.5.27) unstable; urgency=low * Further translation fixes. In particular, use Encoding(...getText()) <- "UTF-8" to ensure strings from the GUI are UTF-8, and not unknown. * Ensure training dataset rather than sample dataset nomenclature is now used. * Ensure execute button can only be clicked once while it is processing. * Survival plot buttons need to be made sensitive as appropriate. * For Japanese on MS/Windows do not use monospace font since this ends up vertically centering periods and commas (and all other characters). Need a fixed width font that does not do this, but for now we put up with variable width font. * Revert to using only English for all hidden tab labels. * Improved identification of current plot number. * Bug fix multiple vars selected for asnumeric and ascategoric transforms.
-- Graham Williams Thu, 22 Apr 2010 06:17:20 +1000 rattle (2.5.26) unstable; urgency=low * Add Cross Tab option to Explore tab to generate cross tabulations of each categoric variable by the target variable. (Luke Lake) * Bug fix - improve how we obtain the plot number from the title, particularly in the context of translations. * Further translation markup. * Clean up the use of dfedit. * Minor improvement to spacing in Log tab. -- Graham Williams Tue, 30 Mar 2010 21:37:29 +1100 rattle (2.5.25) unstable; urgency=low * Start using the RGtk2DfEdit for the View and Edit buttons of the Data tab, and the Enter/Score option of the Evaluate tab. RGtk2DfEdit provides a spreadsheet-like interface to the data. Various data editing options are available. Also press = to run an arbitrary R command on selected data (e.g. select two columns of data and issue the plot command). * Add further markup of text for translations. * Support specification of the character used for decimal points (to suit some European usage). * Fix bug in exporting XML - replace & with &amp; * Survival plots - split survival chart plot from residuals plots, and plot all residuals. * Fix logic behind what is greyed out in the Test tab. -- Graham Williams Mon, 29 Mar 2010 19:37:25 +1100 rattle (2.5.24) stable; urgency=low * Revamp the help text, and put into the Rtxt translation framework. * Fix the height of the data name widget (the library option was growing the width for some reason). * For Evaluate, add Full and Enter as dataset options. Enter will pop up an editor with the final row from the dataset, allowing you to add rows or modify the supplied row. We supply the row so that we have an example to work from. Full uses the whole original dataset. -- Graham Williams Sat, 06 Mar 2010 14:17:12 +1100 rattle (2.5.23) stable; urgency=low * Catch "arules" error in converting data to transactions when baskets contain repeated items.
* When data tab is executed, and so crs$rpart is reset to NULL, always remove the Draw/Rules button from the Tree option of the Predictive tab. * Add code to fix translations that are not being loaded when using RGtk2 on MS/Windows. All is okay on GNU/Linux, but RGtk2 seems not to get the right locale for loaded Glade file. The fix is to traverse the GUI and change all labels, on starting up Rattle. RGtk2 authors tried to fix but it remains an issue. * Ensure rpart is reset on resetting rattle. * Rework handling of tab pages because a Japanese translation on MS/Windows is having issues with the following call (nb=notebook) nb$getTabLabelText(nb$getNthPage(nb$getCurrentPage())) returning what looks like Shift-JIS encoding of the string rather than UTF-8, and hence not string matching the expected tab label. * Fix spelling errors on help menu and ensure help for all topics is covered. * For nnet, use MaxNWts=10000 (default is 1000) to allow larger nets by default, and capture the error message when this is exceeded and better explain what to do. * Ensure we don't export an empty dataset when choosing export on the data tab. * Capture arules error message when there are repeated items in one basket, and explain this more clearly. * For rpart use information as the default split rather than Gini - makes little if any difference. * Allow showHelpPlus to have an extra/alternative question that is displayed. * All random seeds should be 42. * Reset kmeans tab on loading a project. * Add a dozen more weather stations to the weatherAUS dataset. * Improve the logic for the display of the Report radio buttons on the Evaluate tab. * Spelling correction to a number of tooltips. -- Graham Williams Wed, 03 Mar 2010 06:50:58 +1100 rattle (2.5.22) stable; urgency=low * Default window height is 650, but not forced so that the window nicely fills a netbook screen if maximised. * Bump R dependency to 2.8.0 in line with update of the CITATION file.
-- Graham Williams Sat, 13 Feb 2010 09:48:00 +1100 rattle (2.5.21) stable; urgency=low * Re-enable gettext on MS/Windows, even though RGtk2 2.12.18 has not fixed the bindtextdomain problem with glade files and package supplied translations. * Change the tree plot to use "< =>" and ">= <" to clearly identify which branch the "=" results go. Could not figure out how to get expression to use a "ge" symbol. * Improve formatting of the PvO plots. * Use the pairs.panels function from the psych package for the default scatterplot on the Explore tab. * Add INSTALL file. -- Graham Williams Sun, 07 Feb 2010 15:03:22 +1100 rattle (2.5.20) stable; urgency=low * Restore missing weather.csv file. * Add to Google code: weather.R ChangeLog NEWS ToDo upload_uwe.sh upload_cran.sh. -- Graham Williams Sun, 31 Jan 2010 11:07:55 +1100 rattle (2.5.19) unstable; urgency=low * Ensure the right labels (Time/Risk rather than Class/Prob) displayed in filechooser when exporting a survival model. * Model tab renamed as Predictive. * Ensure boxplots have same "by ..." in the main title. * Update the weather dataset and include many more weather stations in the weatherAUS dataset. * Rtxt does no translations when running on MS/Windows (for now). -- Graham Williams Sat, 30 Jan 2010 09:28:18 +1100