santoku 1.1.0
- Core logic has been sped up using raw pointers. This was vibe-coded by me and Claude Code. If it breaks, please file a bug report.
- The experimental chop_spikes() and dissect() functions give common values of x their own singleton intervals.
- On Unicode platforms, infinity will be represented as ∞ in breaks. Set options(santoku.infinity = "Inf") to use the old behaviour.
- Singleton breaks are not labelled specially by default in chop_quantiles(..., raw = FALSE). This means that e.g. if the 10th and 20th percentiles are both the same number, the label will still be [10%, 20%].
- When multiple quantiles are the same, santoku warns and returns the leftmost quantile interval. Previously it merged the intervals, creating labels that might differ from what the user asked for.
- chop_quantiles() gains a recalc_probs argument. recalc_probs = TRUE recalculates probabilities using ecdf(x), which may give more accurate interval labels.
- single = NULL has been documented explicitly in lbl_* functions.
- Bugfix: brk_manual() no longer warns if close_end = TRUE (the default).
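
As a rough illustration of the santoku.infinity option mentioned above, the sketch below sets it temporarily and chops against infinite breaks. The data values are made up for the example, and the exact label text may vary between versions.

```r
library(santoku)

# Temporarily ask for the plain-text "Inf" spelling in break labels.
old_opts <- options(santoku.infinity = "Inf")

x <- c(1, 5, 10)                  # made-up data
res <- chop(x, c(-Inf, 5, Inf))   # breaks that include infinite endpoints
levels(res)

# Restore whatever the option was before.
options(old_opts)
```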
santoku 1.0.0
CRAN release: 2024-06-04
- santoku is now considered stable.
- chop_quantiles() and brk_quantiles() gain a new weights argument, letting you chop by weighted quantiles using Hmisc::wtd.quantile().
- brk_quantiles() may now return singleton breaks, producing more accurate results when x has duplicate elements.
- Some deprecated functions have been removed, and the raw argument to lbl_* functions now always gives a deprecation warning.
santoku 0.10.0
CRAN release: 2023-10-12
- List arguments to fmt in lbl_* functions will be taken as arguments to base::format. This gives more flexibility in formatting, e.g., units breaks.
- chop_n() gains a tail argument, to deal with a last interval containing fewer than n elements. Set tail = "merge" to merge it with the previous interval. This guarantees that all intervals contain at least n elements.
- chop_equally() may return fewer than groups groups when there are duplicate elements. We now warn when this happens.
- Bugfix: chop_n() could return intervals with fewer than n elements when there were duplicate elements. The new algorithm avoids this, but may be slower in this case.
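
The tail argument is easiest to see on a vector whose length is not a multiple of n. This is a minimal sketch with made-up data. It prints the group sizes rather than asserting them, apart from the guarantee stated above.

```r
library(santoku)

x <- 1:10   # made-up data: 10 elements, so groups of 4 leave a remainder

# Default behaviour: the last interval may hold fewer than n elements.
split_tail <- chop_n(x, 4)
table(split_tail)

# tail = "merge" folds a short last interval into the previous one,
# so every interval holds at least n elements.
merged_tail <- chop_n(x, 4, tail = "merge")
table(merged_tail)
```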
santoku 0.9.1
CRAN release: 2023-03-08
- endpoint_labels() methods gain an unused ... argument to satisfy R CMD check.
santoku 0.9.0
CRAN release: 2022-11-01
Breaking changes
There are important changes to close_end.
- close_end is now TRUE by default in chop() and fillet(). In previous versions:

      chop(1:2, 1:2)
      ## [1] [1, 2) {2}
      ## Levels: [1, 2) {2}

  Whereas now:

      chop(1:2, 1:2)
      ## [1] [1, 2] [1, 2]
      ## Levels: [1, 2]

- close_end is now always applied after extend. For example, in previous versions:

      chop(1:4, 2:3, close_end = TRUE)
      ## [1] [1, 2) [2, 3] [2, 3] (3, 4]
      ## Levels: [1, 2) [2, 3] (3, 4]

  Whereas now:

      chop(1:4, 2:3, close_end = TRUE)
      ## [1] [1, 2) [2, 3) [3, 4] [3, 4]
      ## Levels: [1, 2) [2, 3) [3, 4]

We changed this behaviour to be more in line with user expectations.
- If breaks has names, they will be used as labels. Names can also be used for labels in probs in chop_quantiles() and proportions in chop_proportions().
- There is a new raw parameter to chop(). This replaces the parameter raw in lbl_* functions, which is now soft-deprecated.
- lbl_manual() is deprecated. Just use a vector argument to labels instead.
- A labels argument to chop_quantiles() now needs to be explicitly named.
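
To illustrate named breaks, here is a small sketch. The names Low, Mid and High are chosen only for this example.

```r
library(santoku)

# Names on the breaks vector become the interval labels.
res <- chop(1:5, c(Low = 1, Mid = 2, High = 4))
res
levels(res)
```

Each name labels the interval that starts at that break, so values from 4 upwards fall under High.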
I expect these to be the last important breaking changes before we release version 1.0 and mark the package as “stable”. If they cause problems for you, please file an issue.
santoku 0.8.0
CRAN release: 2022-06-08
Breaking changes
- lbl_endpoint() has been renamed to lbl_endpoints(). The old version will trigger a deprecation warning.
- lbl_endpoints() gains first, last and single arguments like other labelling functions.
Other changes
- New chop_pretty(), brk_pretty() and tab_pretty() functions use base::pretty() to calculate attractive breakpoints. Thanks @davidhodge931.
- New chop_proportions(), brk_proportions() and tab_proportions() functions chop x into proportions of its range.
- chop_equally() now uses lbl_intervals(raw = TRUE) by default, bringing it into line with chop_evenly(), chop_width() and chop_n().
- New lbl_midpoints() function labels breaks by their midpoints.
- lbl_discrete() gains a single argument.
- You can now chop ts, xts::xts and zoo::zoo objects.
- chop() is more forgiving when mixing different types, e.g.:
  - Date objects with POSIXct breaks, and vice versa
  - bit64::integer64 and doubles
- Bugfix: lbl_discrete() sometimes had ugly label formatting.
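
A short sketch of the new functions in this release, using made-up data. The break positions passed to chop() in the last call are arbitrary. Outputs are printed rather than asserted, since label formatting may differ between versions.

```r
library(santoku)

x <- c(0, 1, 2.5, 5, 7.5, 9, 10)   # made-up data with range 0 to 10

# Break the range of x at 20% and 80% of the way from min(x) to max(x).
by_prop <- chop_proportions(x, c(0.2, 0.8))
table(by_prop)

# Let base::pretty() pick round breakpoints.
by_pretty <- chop_pretty(x)
table(by_pretty)

# Label each interval by its midpoint rather than by its endpoints.
by_mid <- chop(x, c(2, 8), labels = lbl_midpoints())
table(by_mid)
```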
santoku 0.7.0
CRAN release: 2022-03-18
Breaking changes
- In labelling functions, first and last arguments are now passed to glue::glue(). Variables l and r represent the left and right endpoints of the intervals.
- chop_mean_sd() now takes a vector sds of standard deviations, rather than a single maximum number sd of standard deviations. Write e.g. chop_mean_sd(sds = 1:3) rather than chop_mean_sd(sd = 3). The sd argument is deprecated.
- The groups argument to chop_evenly(), deprecated in 0.4.0, has been removed.
- brk_left() and brk_right(), deprecated in 0.4.0, have been removed.
- knife(), deprecated in 0.4.0, has been removed.
- lbl_format(), questioning since 0.4.0, has been removed.
- Arguments of lbl_dash() and lbl_intervals() have been reordered for consistency with other labelling functions.
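
The glue-based first and last arguments and the sds vector can be tried out as follows. This is only a sketch: the templates, the breaks and the random data are made up for the example.

```r
library(santoku)

# first and last are glue templates; {l} and {r} stand for the left and
# right endpoints of the interval being labelled.
res <- chop(1:10, c(3, 7), labels = lbl_dash(first = "< {r}", last = "{l}+"))
levels(res)

# sds lists every standard-deviation cut point explicitly.
set.seed(1)
z <- chop_mean_sd(rnorm(100), sds = 1:2)
table(z)
```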
Other changes
- You can now chop many more types, including units from the units package, difftime objects, package_version objects, etc.
  - Character vectors will be chopped by lexicographic order, with an optional warning.
  - If you have problems chopping a vector type, file a bug report.
- The glue package has become a hard dependency. It is used in many places to format labels.
- There is a new lbl_glue() function using the glue package. Thanks to @dpprdan.
- You can now set labels = NULL to return integer codes.
- Arguments first, last and single can be used in lbl_intervals() and lbl_dash(), to override the first and last interval labels, or to label singleton intervals.
- lbl_dash() and lbl_discrete() use Unicode em-dash where possible.
- brk_default() throws an error if breaks are not sorted.
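
For example, labels = NULL and lbl_glue() might be used like this. The data, breaks and the "{l} to {r}" template are illustrative choices, not taken from the package documentation.

```r
library(santoku)

x <- 1:10   # made-up data

# labels = NULL skips label creation and returns integer interval codes.
codes <- chop(x, c(3, 7), labels = NULL)
codes

# lbl_glue() builds each label from a glue template using {l} and {r}.
glued <- chop(x, c(3, 7), labels = lbl_glue("{l} to {r}"))
levels(glued)
```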
Bugfixes
- Bugfix: tab() and friends no longer display an x as the variable name.
- Bugfix: lbl_endpoint() was erroring for some types of breaks.
santoku 0.6.0
CRAN release: 2021-11-04
- New arguments first and last in lbl_dash() and lbl_discrete() allow you to override the first and last interval labels.
- Fixes for CRAN.
santoku 0.5.0
CRAN release: 2020-08-27
- Negative numbers can be used in chop_width().
  - This sets left = FALSE by default.
  - Also works for negative time intervals.
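
A minimal sketch of a negative width, with made-up data. The interval labels are printed rather than asserted, because the exact endpoints depend on the package version.

```r
library(santoku)

x <- 1:10   # made-up data

# A negative width; per the note above, left = FALSE becomes the default,
# so intervals are closed on the right.
down <- chop_width(x, -3)
table(down)
```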
santoku 0.4.0
CRAN release: 2020-06-09
Interface changes
The new version has some interface changes. These are based on user experience, and are designed to make using chop() more intuitive and predictable.
- chop() has two new arguments, left and close_end.
  - Using left = FALSE is simpler and more intuitive than wrapping breaks in brk_right().
  - brk_left() and brk_right() have been kept for now, but cannot be used to wrap other break functions.
  - Using close_end is simpler than passing close_end into brk_left() or brk_right() (which no longer accept this argument directly).
  - left = TRUE by default, except for non-numeric objects in chop_quantiles() and chop_equally(), where left = FALSE works better.
- close_end is now FALSE by default.
  - This prevents user surprises when e.g. chop(3, 1:3) puts 3 into a different category than chop(3, 1:4).
  - close_end is TRUE by default for chop_quantiles(), chop_n() and similar functions. This ensures that e.g. chop_quantiles(x, c(0, 1/3, 2/3, 1)) does what you would expect.
- The groups argument to chop_evenly() has been renamed to intervals. This should make it easier to remember the difference between chop_evenly() and chop_equally(). (Chop evenly into n equal-width intervals, or chop equally into n equal-sized groups.)
- knife() has been deprecated to keep the interface slim and focused. Use purrr::partial() instead.
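
The left, close_end and intervals arguments can be compared side by side. This sketch uses made-up data and passes close_end explicitly, since its default has changed between releases.

```r
library(santoku)

x <- 1:4   # made-up data

# Left-closed intervals versus right-closed intervals.
left_closed  <- chop(x, c(2, 3), left = TRUE)
right_closed <- chop(x, c(2, 3), left = FALSE)
levels(left_closed)
levels(right_closed)

# close_end controls whether the final break closes the last interval.
open_end   <- chop(x, c(1, 4), close_end = FALSE)
closed_end <- chop(x, c(1, 4), close_end = TRUE)
levels(open_end)
levels(closed_end)

# chop_evenly() takes intervals, the number of equal-width intervals.
even <- chop_evenly(0:9, intervals = 3)
table(even)
```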
Other changes
- Date and datetime (POSIXct) objects can now be chopped.
  - chop_width() accepts difftime, lubridate::period or lubridate::duration objects
  - all other chop_ functions work as well.
- Many labelling functions have a new fmt argument. This can be a string interpreted by sprintf() or format(), or a 1-argument formatting function for break endpoints, e.g. scales::label_percent().
- Experimental: lbl_discrete() for discrete data such as integers or (most) dates.
- There is a new lbl_endpoint() function for labelling intervals solely by their left or right endpoint.
- brk_mean_sd() now accepts non-integer positive numbers.
- Add brk_equally() for symmetry with chop_equally().
- Minor tweaks to chop_deciles().
- Bugfix: lbl_format() wasn't accepting numeric formats, even when raw = TRUE. Thanks to Sharla Gelfand.
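
The fmt argument and date chopping could look roughly like this. The data are made up, and percent() is a hypothetical helper written for this example, standing in for something like scales::label_percent().

```r
library(santoku)

x <- c(0.1, 0.3, 0.5, 0.9)   # made-up data

# A sprintf()-style format string applied to each endpoint.
pct <- chop(x, c(0.25, 0.75), labels = lbl_intervals(fmt = "%.2f"))
levels(pct)

# A one-argument function works too; percent() is a hypothetical helper.
percent <- function(v) paste0(round(100 * v), "%")
pct_fun <- chop(x, c(0.25, 0.75), labels = lbl_intervals(fmt = percent))
levels(pct_fun)

# Dates can be chopped at Date breaks.
d <- as.Date("2020-01-01") + 0:9
by_date <- chop(d, as.Date("2020-01-05"))
table(by_date)
```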
santoku 0.3.0
CRAN release: 2020-01-24
First CRAN release.
- Changed kut() to kiru(). kiru() is an alternative spelling for chop(), for use when the tidyr package is loaded.
- lbl_sequence() has become lbl_manual().
- lbl_letters() and friends have been replaced by lbl_seq():
  - to replace lbl_letters() use lbl_seq()
  - to replace lbl_LETTERS() use lbl_seq("A")
  - to replace lbl_roman() use lbl_seq("i")
  - to replace lbl_ROMAN() use lbl_seq("I")
  - to replace lbl_numerals() use lbl_seq("1")
  - for more complex formatting use e.g. lbl_seq("A:"), lbl_seq("(i)")
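
As a quick illustration of lbl_seq(), the sketch below labels three intervals with capital letters and with parenthesised roman numerals. The data and breaks are made up.

```r
library(santoku)

x <- 1:10   # made-up data

# Capital letters: the replacement for lbl_LETTERS().
upper <- chop(x, c(3, 7), labels = lbl_seq("A"))
levels(upper)

# Lower-case roman numerals wrapped in parentheses.
roman <- chop(x, c(3, 7), labels = lbl_seq("(i)"))
levels(roman)
```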
