Introduction to Binary Search Tree - Learn DSA

Things to be discussed

  • Introduction

  • Binary Search Tree

  • Implementation

  • Example Problem

  • Practice Problems


Introduction

In computer science, a binary search tree, also called an ordered or sorted binary tree, is a rooted binary tree whose internal nodes each store a key greater than all the keys in the node's left subtree and less than those in its right subtree. It is a special case of a binary tree and has many useful applications in the field of computer science.

Img Src: Google Images


Binary Search Tree

The binary search tree is a data structure that quickly allows us to maintain a sorted list of numbers.

  • It is called a binary tree because each tree node has a maximum of two children.

  • It is called a search tree because it can be used to search for the presence of a number in O(log(n)) time.

The properties that separate a binary search tree from a regular binary tree is

  1. All nodes of the left subtree are less than the root node

  2. All nodes of the right subtree are more than the root node

  3. Both subtrees of each node are also BSTs i.e. they have the above two properties


C++ Implementation for Binary Search Tree


    #include <iostream>
    using namespace std;

    class BST
    {
        struct node
        {
            int value;
            node *left;
            node *right;
        };

        node *root;

        node *makeEmpty(node *t)
        {
            if (t == NULL)
                return NULL;
            {
                makeEmpty(t->left);
                makeEmpty(t->right);
                delete t;
            }
            return NULL;
        }

        node *insert(int x, node *t)
        {
            if (t == NULL)
            {
                t = new node;
                t->value = x;
                t->left = t->right = NULL;
            }
            else if (x < t->value)
                t->left = insert(x, t->left);
            else if (x > t->value)
                t->right = insert(x, t->right);
            return t;
        }

        node *findMin(node *t)
        {
            if (t == NULL)
                return NULL;
            else if (t->left == NULL)
                return t;
            else
                return findMin(t->left);
        }

        node *findMax(node *t)
        {
            if (t == NULL)
                return NULL;
            else if (t->right == NULL)
                return t;
            else
                return findMax(t->right);
        }

        node *remove(int x, node *t)
        {
            node *temp;
            if (t == NULL)
                return NULL;
            else if (x < t->value)
                t->left = remove(x, t->left);
            else if (x > t->value)
                t->right = remove(x, t->right);
            else if (t->left && t->right)
            {
                temp = findMin(t->right);
                t->value = temp->value;
                t->right = remove(t->value, t->right);
            }
            else
            {
                temp = t;
                if (t->left == NULL)
                    t = t->right;
                else if (t->right == NULL)
                    t = t->left;
                delete temp;
            }

            return t;
        }

        void inOrder(node *t)
        {
            if (t == NULL)
                return;
            inOrder(t->left);
            cout << t->value << " ";
            inOrder(t->right);
        }

        void preOrder(node *t)
        {
            if (t == NULL)
                return;
            cout << t->value << " ";
            preOrder(t->left);
            preOrder(t->right);
        }

        void postOrder(node *t)
        {
            if (t == NULL)
                return;
            postOrder(t->left);
            postOrder(t->right);
            cout << t->value << " ";
        }

        node *find(node *t, int x)
        {
            if (t == NULL)
                return NULL;
            else if (x < t->value)
                return find(t->left, x);
            else if (x > t->value)
                return find(t->right, x);
            else
                return t;
        }

    public:
        BST()
        {
            root = NULL;
        }

        ~BST()
        {
            root = makeEmpty(root);
        }

        void insert(int x)
        {
            root = insert(x, root);
        }

        void remove(int x)
        {
            root = remove(x, root);
        }

        void display()
        {
            inOrder(root);
            // preOrder(root);
            // postOrder(root);
            cout << endl;
        }

        void search(int x)
        {
            root = find(root, x);
            root == NULL ? cout << "False\n" : cout << "True\n";
        }
    };

    int main()
    {
        BST t;
        t.insert(20);
        t.insert(25);
        t.insert(15);
        t.insert(30);
        t.insert(10);
        t.display();
        t.remove(20);
        t.display();
        t.remove(25);
        t.display();
        t.remove(30);
        t.display();
        t.search(15);
        t.search(12);
        return 0;
    }

Example Problem

Distinct Count (HackerEarth)

Given an array A of N integers, classify it as being Good Bad or Average. It is called Good, if it contains exactly X distinct integers, Bad if it contains less than X distinct integers and Average if it contains more than X distinct integers. 

Input format:
First line consists of a single integer T denoting the number of test cases.
First line of each test case consists of two space separated integers denoting N and X.
Second line of each test case consists of N space separated integers denoting the array elements.

Output format:
Print the required answer for each test case on a new line.

Constraints: 
1T50
1X,N13000
1A[i]109

SAMPLE INPUT
 
4
4 1
1 4 2 5
4 2
4 2 1 5
4 3
5 2 4 1
4 4
1 2 4 5
SAMPLE OUTPUT
 
Average
Average
Average
Good












Problem Solving Approach

Our initial approach would be to initialize an array, insert all elements and then find how many distinct elements are present, but the time complexity of this would be O(n^2). Can we reduce that?

Yes! By using BST (Binary Search Tree), we can bring down the time complexity to O(nlogn). So how to go about with this? Insert the first element (root) in the BST, and from the next element insert only if it's not present in the BST (Here, the search takes only O(logn) as compared to O(n) in our initial approach).


    // HackerEarth

    #include <iostream>
    using namespace std;

    struct node
    {
        int data;
        node *left, *right;
    };

    node *newNode(int data)
    {
        node *temp = (node *)malloc(sizeof(node));
        temp->data = data;
        temp->left = temp->right = NULL;

        return temp;
    }

    node *insertValue(node *root, int value)
    {
        if (root == NULL)
            return newNode(value);
        else if (root->data >= value)
            root->left = insertValue(root->left, value);
        else
            root->right = insertValue(root->right, value);

        return root;
    }

    bool searchValue(node *root, int value)
    {
        if (root == NULL)
            return false;
        else if (root->data == value)
            return true;
        else if (root->data > value)
            return searchValue(root->left, value);
        else
            return searchValue(root->right, value);
    }

    void inOrder(node *root)
    {
        if (root != NULL)
        {
            inOrder(root->left);
            cout << root->data << " ";
            inOrder(root->right);
        }
    }

    int main()
    {
        int t;
        cin >> t;

        while (t--)
        {
            int count = 1, x, value, n;
            node *root = NULL;

            cin >> n >> x;
            for (int i = 0; i < n; ++i)
            {
                cin >> value;

                if (i == 0)
                    root = insertValue(root, value);
                else if (!searchValue(root, value))
                    insertValue(root, value), count += 1;
            }

            if (count == x)
                cout << "Good\n";
            else if (count > x)
                cout << "Average\n";
            else
                cout << "Bad\n";
        }

        return 0;
    }

But, this is not the only way to solve this problem. Even if you've solved this question, try to solve this question by using an STL container called set.

What a set does is, it removes duplicates. So, when you're scanning the array, you could simply keep them in the set and use the size of the set to check if it's possible. The complexity? Well, that's for you to find out - how much have you managed to reduce it.

Also, there is another sleek way to solve this problem with complexity O(n) if we don't consider STL complexity. You can simply use an STL container called map.

It has a property that, we can access any value using its key by using the [] operator. In case that key is non-existent, it creates a new instance of that key in the map and returns a reference to it. That's what we need. Simply, use the [] operator for every value. If that value exists, it will simply return its value, otherwise, a new key will be generated. Now at the end just check the size of the map, and decide what do you need to output, according to the given problem statement.


Practice Problems

  1. The Playboy Chimp (Online Judge)

  2. Distance in a Tree (Codechef)

  3. Valid Sets (Codechef)

  4. Ant on the Tree (Codechef)

  5. Apple Tree (Codechef)


The Related Articles which you should read after this


If there is any error or you have any doubt in the article then do ask us through the comment section, we will be happy to help you.

If you think that this article has helped you to learn something then do share this with your friends.

Thank You for Reading.