VC dimension of ellipsoids

Reading time: 6 minutes

📝 Original Info

  • Title: VC dimension of ellipsoids
  • ArXiv ID: 1109.4347
  • Date: 2011-09-21
  • Authors: Yohji Akama and Kei Irie

📝 Abstract

We will establish that the VC dimension of the class of d-dimensional ellipsoids is (d^2+3d)/2, and that maximum likelihood estimation with N-component d-dimensional Gaussian mixture models induces a geometric class having VC dimension at least N(d^2+3d)/2. Keywords: VC dimension; finite dimensional ellipsoid; Gaussian mixture model


📄 Full Content

For sets X ⊆ R^d and Y ⊆ X, we say that a set B ⊆ R^d cuts Y out of X if Y = X ∩ B. A class C of subsets of R^d is said to shatter a set X ⊆ R^d if every Y ⊆ X is cut out of X by some B ∈ C. The VC dimension of C, denoted by VCdim(C), is defined to be the maximum n (or ∞ if no such maximum exists) for which some subset of R^d of cardinality n is shattered by C.
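These definitions can be checked mechanically for small finite cases. Below is a minimal brute-force sketch, assuming a finite candidate class given as membership predicates; the names shatters and subsets are illustrative, not from the paper:

```python
from itertools import chain, combinations

def subsets(points):
    """All subsets of a tuple of points (2^|S| of them)."""
    return chain.from_iterable(combinations(points, r) for r in range(len(points) + 1))

def shatters(class_of_sets, points):
    """class_of_sets: iterable of membership predicates pred(x) -> bool.
    S is shattered iff every Y subset of S equals {x in S : pred(x)} for some pred."""
    for target in map(set, subsets(points)):
        if not any({x for x in points if pred(x)} == target for pred in class_of_sets):
            return False
    return True

# Example: intervals [a, b] on the line shatter any 2 points, but no 3 points,
# since no interval keeps the two outer points while excluding the middle one.
intervals = [lambda x, a=a, b=b: a <= x <= b
             for a in range(-5, 6) for b in range(-5, 6)]
print(shatters(intervals, (0, 2)))     # True
print(shatters(intervals, (0, 1, 2)))  # False
```

So the class of intervals on R has VC dimension 2; the results below compute the analogous number for ellipsoids in R^d.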

The VC dimension of a class measures the complexity of the class, and is employed in empirical process theory [4], statistical and computational learning theory [8,3], and discrete geometry [6]. Although asymptotic estimates of VC dimensions are available for many classes, exact values are known for only a few classes (e.g. the class of Euclidean balls [10], the class of halfspaces [6], and so on).

In Section 2, we prove:

Theorem 1. The class of d-dimensional ellipsoids has VC dimension (d^2 + 3d)/2.

Let G_d be the class of d-dimensional Gaussian distributions, where a covariance matrix of size d is, by definition, a real, positive definite matrix. As in statistical learning theory [8], for a class P of probability density functions we consider the class D(P) of sets {x ∈ R^d : f(x) > s} such that f is a probability density function in P and s is a positive real number. Then D(G_d) is the class of d-dimensional ellipsoids.
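The last identification can be seen by unwinding the level-set condition for the density f of N(μ, Σ); this routine computation is not spelled out in the text, but runs as follows:

```latex
\{x : f(x) > s\}
  = \Bigl\{x : \tfrac{1}{(2\pi)^{d/2}\lvert\Sigma\rvert^{1/2}}
      \exp\bigl(-\tfrac12 (x-\mu)^{\top}\Sigma^{-1}(x-\mu)\bigr) > s\Bigr\}
  = \bigl\{x : (x-\mu)^{\top}\Sigma^{-1}(x-\mu) < r^2\bigr\},
  \quad r^2 := -2\log\bigl((2\pi)^{d/2}\lvert\Sigma\rvert^{1/2}\, s\bigr).
```

Since Σ^{-1} is positive definite, this set is an open ellipsoid (nonempty exactly when r^2 > 0), and conversely every open ellipsoid arises this way for suitable μ, Σ, and s.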

For a positive integer N, an N-component d-dimensional Gaussian mixture model [7] (an (N, d)-gmm) is, by definition, any probability distribution belonging to the convex hull of some N d-dimensional Gaussian distributions. Suppose we are given a sample from an (N, d)-gmm population but the number N of components is unknown. Selecting N from the sample is an example of Akaike's model selection problem [1] (see [5] for a recent approach). The authors of [9] proposed to choose N by the structural risk minimization principle [8], where an important role is played by the VC dimension of the class D((G_d)_N), with (G_d)_N being the class of (N, d)-gmms. Our result is that the VC dimension of D((G_d)_N) is greater than or equal to N(d^2 + 3d)/2.
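To make the class D((G_d)_N) concrete, here is a small numpy sketch of a 2-component, 2-dimensional mixture density f and membership in the set {x : f(x) > s} that it cuts out; the particular weights, means, covariances, and threshold are arbitrary choices for illustration:

```python
import numpy as np

def gaussian_pdf(x, mean, cov):
    """Density of N(mean, cov) evaluated at the rows of x (shape (n, d))."""
    d = mean.shape[0]
    diff = x - mean
    quad = np.einsum('ni,ij,nj->n', diff, np.linalg.inv(cov), diff)
    norm = (2 * np.pi) ** (d / 2) * np.sqrt(np.linalg.det(cov))
    return np.exp(-0.5 * quad) / norm

def gmm_pdf(x, weights, means, covs):
    """Density of the mixture sum_k weights[k] * N(means[k], covs[k])."""
    return sum(w * gaussian_pdf(x, m, c) for w, m, c in zip(weights, means, covs))

# An arbitrary (N, d) = (2, 2) mixture: a point of conv{N(m1, C1), N(m2, C2)}.
weights = [0.6, 0.4]
means = [np.array([0.0, 0.0]), np.array([3.0, 1.0])]
covs = [np.eye(2), np.array([[2.0, 0.5], [0.5, 1.0]])]

# Membership in {x : f(x) > s}, a set of the class D((G_d)_N).
s = 0.02
pts = np.array([[0.0, 0.0], [3.0, 1.0], [10.0, 10.0]])
print(gmm_pdf(pts, weights, means, covs) > s)   # [ True  True False]
```

Unlike a single Gaussian, the superlevel set of a mixture need not be an ellipsoid (it can even be disconnected), which is why D((G_d)_N) is richer and only the lower bound N(d^2 + 3d)/2 is claimed.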

We will prove Theorem 1. For a positive integer B, a vector a ∈ R^B \ {0}, and c ∈ R, we write the affine function ℓ_{a,c}(x) := ^t a x + c (x ∈ R^B) and the open halfspace H_{a,c} := {x ∈ R^B : ℓ_{a,c}(x) < 0}. We say a set W ⊆ R^B spans an affine subspace H ⊆ R^B if H is the smallest affine subspace that contains W. The cardinality of a set S is denoted by |S|. For a vector a = ^t(a_1, …, a_B) …

Proof. By an affine transformation we can assume without loss of generality that all the components of the vector a are 1 and that S is the canonical basis {e_1, …, e_B}. …

Proof. Let B be the right-hand side. Let ϕ be a map S^{d-1} → R^B which maps …

there is some set S ⊂ S^{d-1} such that |S| = B and ϕ(S) spans the hyperplane. Let a ∈ R^B be the vector whose first d components are 1 and whose other components are 0. By Lemma 2, for any ε > 0 the family …

… By the definition of ϕ, the class of sets defined by quadratic inequalities …

But, when ε is sufficiently small, all of these sets are ellipsoids.
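The map ϕ is not written out in the excerpt above, but the dimension count B = d + d(d+1)/2 = (d^2 + 3d)/2 together with the phrase "quadratic inequalities" fits the standard lifting by degree-one and degree-two monomials. A minimal sketch under that assumption (the monomial ordering is an arbitrary choice of mine):

```python
import numpy as np
from itertools import combinations_with_replacement

def phi(x):
    """Lift x in R^d to R^B, B = d + d(d+1)/2, via all monomials of degree 1 and 2."""
    x = np.asarray(x, dtype=float)
    quad = [x[i] * x[j] for i, j in combinations_with_replacement(range(len(x)), 2)]
    return np.concatenate([x, quad])

d = 3
B = d + d * (d + 1) // 2        # = (d**2 + 3*d) // 2 = 9 for d = 3
x = np.array([1.0, 2.0, -1.0])
assert phi(x).shape == (B,)

# A halfspace {z in R^B : a.z + c < 0} pulls back under phi to
# {x in R^d : sum_i a_i x_i + sum_{i<=j} a_{ij} x_i x_j + c < 0},
# a region bounded by a quadric.  Shattering points of the sphere by
# halfspaces upstairs therefore yields shattering by quadratic regions
# downstairs, and for suitable small perturbations these regions are ellipsoids.
a, c = np.random.randn(B), -1.0
print(a @ phi(x) + c < 0)
```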

We verify the converse inequality.

Below, the convex hull of a set A is denoted by conv(A).

If there are x = (u, x_B), y = (u, y_B) ∈ S such that x_B < y_B, then for any a ∈ R^B with nonnegative last component and for any c ∈ R we have ℓ_{a,c}(x) ≤ ℓ_{a,c}(y), and thus x ∈ H_{a,c} = {x ∈ R^B : ℓ_{a,c}(x) < 0} whenever y ∈ H_{a,c}. Hence no member of C cuts {y} out of {x, y} ⊆ S, which contradicts the assumption that C shatters S. Therefore π is injective on S for the canonical projection π : R^B → R^{B-1} onto the first B − 1 coordinates.

By applying Radon's theorem [6] to the set π(S) ⊂ R^{B-1}, there is a partition (T_1, T_2) of S such that we can take y from conv(π(T_1)) ∩ conv(π(T_2)). Then there are z, z′ ∈ R such that (y, z) ∈ conv(T_1) and (y, z′) ∈ conv(T_2). Because C shatters S, there are some a ∈ R^B and some c ∈ R such that the last component a_B of a is nonnegative and the halfspace H_{a,c} ∈ C cuts T_1 out of S. Thus ℓ_{a,c}(x) < 0 for all x ∈ conv(T_1), while ℓ_{a,c}(x) ≥ 0 for all x ∈ conv(T_2), where T_2 = S \ T_1. Therefore ℓ_{a,c}(y, z) < 0 ≤ ℓ_{a,c}(y, z′); since ℓ_{a,c}(y, z′) − ℓ_{a,c}(y, z) = a_B(z′ − z) and a_B ≥ 0, we must have z′ > z. On the other hand, some member H_{a′,c′} ∈ C cuts T_2 out of S, and by the same reasoning z > z′, which is a contradiction.
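Radon's theorem asserts that any B + 1 points in R^{B-1} admit a partition into two parts whose convex hulls intersect. The partition can be computed from an affine dependence among the points; the routine below is the standard construction (not from the paper):

```python
import numpy as np

def radon_partition(points):
    """points: array of shape (m, n) with m >= n + 2.
    Returns index sets part1, part2 and a point y in both convex hulls."""
    pts = np.asarray(points, dtype=float)
    m, _ = pts.shape
    # Affine dependence: lam != 0 with lam @ pts = 0 and lam.sum() = 0.
    A = np.vstack([pts.T, np.ones(m)])
    lam = np.linalg.svd(A)[2][-1]          # a null-space direction of A
    pos = lam > 0
    y = lam[pos] @ pts[pos] / lam[pos].sum()
    return np.where(pos)[0], np.where(~pos)[0], y

# Four points in the plane (B = 3, so B + 1 points in R^{B-1}):
pts = np.array([[0.0, 0.0], [2.0, 0.0], [1.0, 2.0], [1.0, 0.5]])
part1, part2, y = radon_partition(pts)
print(part1, part2, y)   # the inner point versus the surrounding triangle
```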

Proof. Let 0 ∉ conv(A). Then for every finite subset A′ of A we have 0 ∉ conv(A′), and there is a hyperplane J through 0 such that conv(A′) is contained in one of the two open halfspaces determined by J. So there is a new rectangular coordinate system whose origin is that of the old rectangular coordinate system, such that one of the new coordinate axes is normal to J and every a ∈ A′ is represented as (a_1, …, a_B) with a_B > 0. So VCdim({H_{a,c}}_{a∈A′, c∈R}) ≤ B by Lemma 4, and thus VCdim({H_{a,c}}_{a∈A, c∈R}) ≤ B.

The proof of Theorem 1 is as follows. By Lemma 3, we have only to establish that the class of d-dimensional ellipsoids has VC dimension less than or equal to B := (d^2 + 3d)/2. Assume otherwise. For a = ^t(a_1, …, a_B) ∈ R^B and x = ^t(x_1, …, x_d), define a quadratic form q_a(x) and a quadratic polynomial p_a(x) by …

Let A be the set of a ∈ R^B such that …
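The displayed definitions of q_a and p_a are not shown above. A natural reading, consistent with the coefficient count B = d + d(d+1)/2 but an assumption on my part rather than the paper's verbatim formulas, splits a into d linear coefficients a_i and d(d+1)/2 quadratic coefficients a_{ij}:

```latex
q_a(x) := \sum_{1 \le i \le j \le d} a_{ij}\, x_i x_j,
\qquad
p_a(x) := q_a(x) + \sum_{i=1}^{d} a_i\, x_i .
```

Under this reading, every d-dimensional ellipsoid is a set {x ∈ R^d : p_a(x) + c < 0} whose quadratic form q_a is positive definite, and A would be the set of such a. Any convex combination of elements of A then has a positive definite quadratic part and so is nonzero, giving 0 ∉ conv(A); the preceding lemma then bounds the VC dimension of the corresponding halfspaces, and hence of the ellipsoids, by B.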


Reference

This content was AI-processed from open access ArXiv data.
