How to label data index with count using 3D histogram in Matlab
I have a set of data points (around 20000) with x,y values, and I want to remove the points that are not close to other points. I am trying to approach this by 'digitizing' the data, and I think the closest way to implement this in MATLAB is a 3D histogram, so that I can remove the points in low-count bins. I used hist3(), but the problem is that I couldn't get the index of the points labelled with the counts (like the output 'ind' of histc()). The only way I can think of is a nested for loop, which is the last thing I want to try. Is there a way I can label the points' index, or is there another approach to this?
Thanks
I feel I need to add some clarification.
I have a histogram graph of the data, generated by @rayryeng.
There are bins that have n=0 or n=1, and I want to remove the data in these bins. For histc() there is a form of output [bincounts,ind] = histc( ) where ind returns the bin numbers the data falls into. With that I can find the index of the bins with counts less than/equal to or larger than 1, and then find the data in those particular bins. Is there a similar thing I can do for 2D inputs?
Thanks again
hist3 should be able to accomplish that for you. I'm not quite sure what the problem is. You can call hist3 like so:
[n,c] = hist3(x);
This will automatically partition your dataset into a 10 x 10 grid of equally spaced containers. You can override this behaviour by doing:
[n,c] = hist3(x, nbins);
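nbins is a 2 element array where the first element tells you how many bins you want vertically (the rows of n, which bin the first column of your data) and the second element how many bins you want horizontally (the columns of n, which bin the second column of your data). n will tell you how many elements fall within each location of the grid, and c will give you a 1 x 2 cell array where the first element of the cell array gives you the x co-ordinates of each bin centre, while the second element gives you the y co-ordinates of each bin centre.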
To be explicit, if we have a 10 x 10 grid, c will contain a 2 element cell array where each element is 10 elements long. For each x co-ordinate of a centre found in c{1}, we have 10 corresponding y co-ordinates in c{2} that complete that bin's centre. This means that the first 10 bin centres are located at (c{1}(1), c{2}(1)), (c{1}(1), c{2}(2)), (c{1}(1), c{2}(3)), ..., (c{1}(1), c{2}(10)), and the next 10 bin centres are located at (c{1}(2), c{2}(1)), (c{1}(2), c{2}(2)), (c{1}(2), c{2}(3)), ..., (c{1}(2), c{2}(10)).
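To make that layout concrete, here is a minimal sketch. The data, the seed, and the bin counts [4 6] are made up for illustration, and hist3 needs the Statistics Toolbox. It shows the sizes of n and c, and uses ndgrid to lay the centres out in the same orientation as n, so that n(row,col) pairs with the centre (c{1}(row), c{2}(col)).

```matlab
rng(1);                        %// hypothetical seed, for illustration only
pts = rand(200, 2);            %// hypothetical data: 200 points in the unit square
[n, c] = hist3(pts, [4 6]);    %// 4 bins for column 1 (x), 6 bins for column 2 (y)

disp(size(n));                 %// one row per x bin, one column per y bin
disp(numel(c{1}));             %// number of x centres
disp(numel(c{2}));             %// number of y centres

%// ndgrid (not meshgrid) keeps the orientation of n:
%// cx(row,col) = c{1}(row) and cy(row,col) = c{2}(col)
[cx, cy] = ndgrid(c{1}, c{2});
```

With this layout, n(2,3) is the number of points in the bin centred at (cx(2,3), cy(2,3)).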
As a quick example, let's work on a grid between [0,1] on the x-axis and [0,1] on the y-axis. I'm going to generate 100 2D points. Let's decompose this area into 10 bins horizontally and 10 bins vertically (as per the default of hist3).
rng(100); %// set seed for reproducibility
a = rand(100,2);
[n,c] = hist3(a);
disp(n);
celldisp(c);

We get:

n =

     1     2     0     1     2     0     1     0     1     1
     0     1     1     1     1     1     0     0     2     5
     0     4     1     1     1     1     1     4     0     1
     2     0     3     2     2     1     1     0     2     1
     0     0     0     0     1     1     1     0     0     1
     1     1     1     2     1     1     0     2     0     1
     1     0     2     1     2     0     3     1     1     1
     0     1     0     0     0     1     1     0     0     1
     1     0     1     2     3     3     0     0     0     2
     0     2     1     1     0     1     0     3     0     1

c{1} =

  Columns 1 through 7

    0.0541    0.1528    0.2516    0.3503    0.4491    0.5478    0.6466

  Columns 8 through 10

    0.7453    0.8440    0.9428

c{2} =

  Columns 1 through 7

    0.0513    0.1510    0.2508    0.3505    0.4503    0.5500    0.6498

  Columns 8 through 10

    0.7495    0.8493    0.9491
This tells us that the first grid cell, located at the top left corner of our point distribution, has 1 value logged into it. The next grid cell after that has 2 values logged into it, and so on and so forth. We also have the bin centres for each of the bins, shown in c. Remember, we have 10 x 10 possible bin centres. If we want to display our data together with the bin locations, we can do:
[x,y] = meshgrid(c{1},c{2});
plot(a(:,1), a(:,2), 'b*', x(:), y(:), 'r*');
grid;
In the resulting plot, the red stars denote the bin centres while the blue stars denote our data points within the grid. Because the origin of our plot is at the bottom left corner, while the origin of the n matrix is at the top left corner (i.e. the first bin sits at the top left of the matrix, while in our data it's at the bottom left corner), we need to rotate n by 90 degrees counter-clockwise so that the origins of the two agree with each other, and also agree with the plot. As such:
nrot = rot90(n);
disp(nrot);

nrot =

     1     5     1     1     1     1     1     1     2     1
     1     2     0     2     0     0     1     0     0     0
     0     0     4     0     0     2     1     0     0     3
     1     0     1     1     1     0     3     1     0     0
     0     1     1     1     1     1     0     1     3     1
     2     1     1     2     1     1     2     0     3     0
     1     1     1     2     0     2     1     0     2     1
     0     1     1     3     0     1     2     0     1     1
     2     1     4     0     0     1     0     1     0     2
     1     0     0     2     0     1     1     0     1     0
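For keeping track of where an entry ends up after the rotation, the following small sketch may help. It uses magic(4) purely as a made-up stand-in for a count matrix, and row and col are illustrative names. An entry at (row, col) in the original moves to row size(n,2) - col + 1, column row of the rotated matrix, which is why n(2,10) in the 10 x 10 example appears at nrot(1,2).

```matlab
n = magic(4);                  %// hypothetical stand-in for a count matrix
nrot = rot90(n);               %// rotate 90 degrees counter-clockwise

row = 2; col = 4;              %// an arbitrary entry of n
newrow = size(n,2) - col + 1;  %// where that entry lands after the rotation
newcol = row;

disp(nrot(newrow, newcol) == n(row, col));
disp(isequal(nrot, flipud(n.')));   %// rot90 is a transpose followed by a vertical flip
```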
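As you can see from the picture, the point distribution agrees with what we see within the (rotated) n matrix as well as the bin centres c. Using n (or nrot, if you get the convention correct), you can figure out which points to eliminate from your array of points. For any bins that have low membership within n, you find the points that are closest to the bin centre associated with that grid location in n, then remove them.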
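The same "closest bin centre" idea can also be applied per point instead of per bin, which gives each point a bin label much like the ind output of histc. The sketch below is only an illustration under one assumption: the bins are equal width (as they are when hist3 is given bin counts rather than custom edges), so a point's bin is the one whose centre is nearest in each dimension separately. The names bx, by, counts and pointcounts are made up here.

```matlab
rng(100);                      %// same seed as the example above
a = rand(100, 2);
[n, c] = hist3(a);             %// default 10 x 10 grid

%// Assumption: equal-width bins, so the nearest centre in each
%// dimension identifies the bin a point falls into
[~, bx] = min(abs(bsxfun(@minus, a(:,1), c{1})), [], 2);   %// x bin of each point
[~, by] = min(abs(bsxfun(@minus, a(:,2), c{2})), [], 2);   %// y bin of each point

%// Rebuild the counts from the labels as a sanity check against n
counts = accumarray([bx by], 1, size(n));
disp(isequal(counts, n));

%// Count of the bin that each point lives in
pointcounts = n(sub2ind(size(n), bx, by));
```

Here pointcounts(k) plays the role of the count label for point k, with (bx(k), by(k)) as its bin subscripts in n.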
As an example, suppose the bin in the first row, second column (of the rotated result) is the one you want to filter out. This corresponds to the centre (c{1}(2), c{2}(10)). We also know that we need to filter out the 5 points that belong to this bin centre. Therefore:
numpointstoremove = n(2,10); %// or nrot(1,2)

%// compute the Euclidean distance between the bin centre and every point
dists = sqrt(sum(bsxfun(@minus, a, [c{1}(2) c{2}(10)]).^2, 2));

%// find the numpointstoremove closest points to the bin centre and remove them
[~,ind] = sort(dists);
a(ind(1:numpointstoremove), :) = [];
We sort our distances in ascending order, then determine the numpointstoremove closest points to the bin centre. We then remove those rows from our data matrix.
If you want to remove the bins that have either a 0 or a 1 for the count, you can find those locations, then run a for loop and filter accordingly. However, a bin with a count of 0 means that we don't need to run through and filter anything, because no points were mapped there! You only need to filter out the points in bins that have a count of 1. In other words:
[rows, cols] = find(n == 1);
for index = 1 : numel(rows)
    row = rows(index);
    col = cols(index);

    %// compute the Euclidean distance between the bin centre and every point
    dists = sqrt(sum(bsxfun(@minus, a, [c{1}(row) c{2}(col)]).^2, 2));

    %// find the closest point to the bin centre and remove it
    [~,ind] = min(dists);
    a(ind,:) = [];
end
As you can see, this is very similar to the procedure above. Since we wish to filter out the bins that only have 1 point assigned to them, we just need to find the minimum distance. Remember, we don't need to process any bins that have a count of 0, so we can skip those.
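If your MATLAB version has histcounts2 (an assumption; it is not used anywhere above), a loop-free variant is possible, because histcounts2 can also return the bin subscripts of every point, which is the 2D counterpart of the ind output of histc. The sketch below keeps only the points whose bin holds at least mincount points. The names mincount, pointcounts and filtered are made up for this illustration, and the bin edges that histcounts2 picks need not match those of hist3. Labelling points directly also avoids a corner case of the distance-based filter: the point nearest a bin centre can sit just across the border in a neighbouring bin.

```matlab
rng(100);                      %// same seed as the example above
a = rand(100, 2);
mincount = 2;                  %// hypothetical threshold: drop bins with 0 or 1 points

%// binx(k) and biny(k) are the bin subscripts of point k
[n, ~, ~, binx, biny] = histcounts2(a(:,1), a(:,2), [10 10]);

%// Look up the count of the bin each point falls into
pointcounts = n(sub2ind(size(n), binx, biny));

%// Keep only the points that live in sufficiently populated bins
filtered = a(pointcounts >= mincount, :);
```

Raising mincount makes the filter stricter, and changing [10 10] controls how coarse the 'digitizing' grid is.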