Data mining. Textbook
Sergey Pavlov

Pavel Minakov

Vadim Shmal


Sergey Pavlov, Master, PLEKHANOV RUSSIAN UNIVERSITY OF ECONOMICS
Pavel Minakov, Ph. D., Associate Professor, RUSSIAN UNIVERSITY OF TRANSPORT (MIIT)
Vadim Shmal, Ph. D., Associate Professor, RUSSIAN UNIVERSITY OF TRANSPORT (MIIT)





Data mining

Textbook



Vadim Shmal

Pavel Minakov

Sergey Pavlov



Vadim Shmal, 2022

Pavel Minakov, 2022

Sergey Pavlov, 2022



ISBN 978-5-0059-4479-5

Created with Ridero smart publishing system




Data mining


Data mining is the process of extracting and discovering patterns in large datasets using methods at the interface of machine learning, statistics, and database systems, especially databases containing large numerical values. This includes searching large amounts of information for statistically significant patterns using complex mathematical algorithms. Collected variables include the value of the input data, the confidence level and frequency of the hypothesis, and the probability of finding a random sample. It also includes optimizing the parameters to get the best pattern or result, adjusting the input based on some facts to improve the final result. These parameters include parameters for statistical means such as sample sizes, as well as statistical measures such as error rate and statistical significance.
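
The notions of pattern frequency, sample size, and statistical significance mentioned above can be illustrated with a short calculation. The sketch below is not from the book: the occurrence count, sample size, and baseline rate are hypothetical, and the normal approximation is just one convenient way to estimate significance.

```python
# A minimal sketch (hypothetical numbers): testing whether an observed pattern
# frequency is statistically significant, using a normal approximation to the
# binomial distribution.
import math

def pattern_significance(occurrences, sample_size, baseline_rate):
    """Return the observed rate and an approximate one-sided p-value for
    'the pattern occurs more often than the baseline rate'."""
    observed_rate = occurrences / sample_size
    # Standard error of a proportion under the null hypothesis.
    se = math.sqrt(baseline_rate * (1 - baseline_rate) / sample_size)
    z = (observed_rate - baseline_rate) / se
    # One-sided p-value from the standard normal survival function.
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return observed_rate, p_value

rate, p = pattern_significance(occurrences=180, sample_size=1000, baseline_rate=0.15)
print(f"observed rate = {rate:.3f}, approximate p-value = {p:.4f}")
```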

The ideal scenario for data mining is that the parameters are in order, which provides the best statistical results with the most likely success values. In this ideal scenario, data mining takes place within a closed mathematical system that collects all inputs to the system and produces the most likely outcome. In fact, the ideal scenario is rarely found in real systems. For example, it does not occur in real life when engineering estimates for a real design project are received. Instead, many factors are used to calculate the best measure of success, such as project parameters and the current difficulty of bringing the project to its specifications, and these parameters change constantly as the project progresses. While they may be useful in certain situations, such as the development of specific products, their values should be re-evaluated continually depending on the current conditions of the project. In fact, the best data analysis happens in a complex mathematical structure of problems with many variables and many constraints, not in a closed mathematical system with only a few variables and a closed mathematical structure.

Data is often collected from many different sources and several different directions. Each type of data is analyzed, and all of that output is analyzed to get an estimate of how each piece of data may or may not be involved in the final result. Such analysis is often referred to as the analysis process or data analysis. Data analysis also includes identifying other important information about the database that may or may not have a direct impact on the results. Often, they are also generated from different sources.

Data is usually collected from many different sources, and many statistical methods are applied to obtain the best statistical results. The results of these methods are often referred to as statistical properties or parameters, and they often define mathematical formulas that are intended for the results of each mathematical model. Mathematical formulas are often the most important aspects of the data analysis process and are usually structured using mathematical formulas known as algorithms. Some mathematical algorithms are based on some theoretical approach or model. Other mathematical algorithms use logic and logical proofs as mathematical tools to understand data. Still other mathematical algorithms use computational procedures such as mathematical modeling to understand a particular problem or dataset. While such computational procedures may be necessary to complete a mathematical model of the data, such mathematical algorithms may have other mathematical tools that may be more appropriate for the real world. Although these mathematical models are often very complex, it is often easier to develop a mathematical algorithm and model from a mathematical model than from an actual data analysis process.

In reality, there are usually a number of mathematical models that provide a more complete understanding of the situation and data than any one mathematical model or mathematical algorithm. The data is then analyzed, and a mathematical model of the data is often used to derive a specific parameter value. This parameter value is usually determined by numerical calculations. If a parameter does not have a direct relationship with the result of the final analysis, the parameter is sometimes calculated indirectly using a statistical procedure that yields a parameter that has a direct correlation with the result of the data analysis. If a parameter has a direct correlation with the result of the data analysis, this parameter is often used directly to obtain the final result of the analysis. If the parameter is not directly related to the result of the analysis, the parameter is often obtained indirectly using a mathematical algorithm or model. For example, if the data can be described by a mathematical model, then a parameter can be obtained indirectly using a mathematical algorithm or model. It is usually easier to get the parameter directly or indirectly using a mathematical algorithm or model.
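
As a concrete illustration of obtaining a parameter value indirectly from a mathematical model, the sketch below fits a simple linear model to observed data and reads off the slope. The data values and the choice of a linear model are hypothetical, not taken from the book.

```python
# A minimal sketch (illustrative only): deriving a parameter indirectly by
# fitting a simple linear model to hypothetical observed data.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # input variable
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])   # observed result

# Fit y ~ slope * x + intercept by least squares; the slope is the
# parameter obtained indirectly from the mathematical model.
slope, intercept = np.polyfit(x, y, deg=1)
print(f"estimated slope = {slope:.3f}, intercept = {intercept:.3f}")
```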

By collecting and analyzing many different kinds of data, and performing mathematical analysis on the data, the data can be analyzed and statistics and other statistical tools can be used to produce results. In many cases, the use of numerical calculations to obtain real data can be very effective. However, this process usually requires real-world testing before data analysis.




Agent mining


Agent-based mining is an interdisciplinary field that combines multi-agent systems with data mining and machine learning to solve problems in business and science.

Agents can be described as decentralized computing systems that have both computing and communication capabilities. Agents are modeled on data processing and information gathering algorithms, such as machine learning techniques that try to find solutions to business problems without any central data center.

Agents are like distributed computers where users share computing resources with each other. This allows agents to exchange payloads and process data in parallel, effectively speeding up processing and allowing agents to complete their tasks faster.
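
A toy sketch of this idea of parallel processing follows. It is only an illustration: the partitioning scheme, the threshold, and the counting task are hypothetical choices rather than anything specified in the text.

```python
# A toy sketch (hypothetical task): several "agents" process partitions of a
# dataset in parallel and then merge their partial results.
from multiprocessing import Pool

def agent_task(partition):
    """Each agent counts the records in its partition that exceed a threshold."""
    return sum(1 for value in partition if value > 10)

if __name__ == "__main__":
    data = list(range(100))
    partitions = [data[i::4] for i in range(4)]   # split the data among 4 agents
    with Pool(processes=4) as pool:
        partial_counts = pool.map(agent_task, partitions)
    print("total matches:", sum(partial_counts))
```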

A common use of agents is data processing and communication, such as the task of searching and analyzing large amounts of data from multiple sources for specific patterns. Agents are especially efficient because they don't have a centralized server to keep track of their activities.

Currently, there are two technologies in this area that provide the same functionality as agents, but only one of them is widely used: distributed computing, which is CPU-based and often uses centralized servers to store information; and local computing, which is typically based on local devices such as a laptop or mobile phone, with users sharing information with each other.




Anomaly detection


In data analysis, anomaly detection (also outlier detection) is the identification of rare elements, events, or observations that are suspicious because they differ significantly from most of the data. One application of anomaly detection is in security or business intelligence as a way to determine the unique conditions of a normal or observable distribution. Anomalous distributions differ from the mean in three ways. First, they can be correlated with previous values; second, there is a constant rate of change (otherwise they are an outlier); and third, they have zero mean. The regular distribution is the normal distribution. Anomalies in the data can be detected by measuring the mean and dividing by the value of the mean. Because there is no theoretical upper limit on the number of occurrences in a dataset, these multiples are counted and represent items that have deviations from the mean, although they do not necessarily represent a true anomaly.
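
One simple and widely used way to flag observations that deviate strongly from the mean is the z-score. The sketch below is an illustration under that assumption rather than the authors' own procedure; the data values and the 2.5-standard-deviation threshold are hypothetical.

```python
# A minimal sketch (one common convention): flagging observations that deviate
# strongly from the mean using z-scores. Data and threshold are hypothetical.
import numpy as np

data = np.array([9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 25.0, 10.1, 9.9])

mean = data.mean()
std = data.std(ddof=1)                   # sample standard deviation
z_scores = (data - mean) / std

outliers = data[np.abs(z_scores) > 2.5]  # points far from the mean
print("suspected anomalies:", outliers)
```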

Data Anomaly Similarities

The concept of an anomaly can be described as a data value that differs significantly from the mean of the distribution. But this description of anomalies is also quite general. Any number of outliers can occur in a dataset if there is a difference between observed relationships or proportions. This concept is best known for observing relationships, which are averaged to obtain a distribution. A similar observed ratio or proportion is much less notable than an anomaly. Anomalies are not necessarily rare. Even when the observations are more similar than the expected values, the observed distribution may not be the typical or expected distribution (outliers). However, there is also a natural distribution of possible values that observations can fit into. Anomalies are easy to spot by looking at the statistical distribution of the observed data.

In the second scenario, there is no known distribution, so it is impossible to conclude that the observations are typical of any distribution. However, there may be an available distribution that predicts the distribution of observations in this case.

In the third scenario, there are enough different data points to use the resulting distribution to predict the observed data. This is possible when using data that is not very normal or has varying degrees of deviation from the observed distribution. In this case, there is an average or expected value. A prediction is a distribution that will describe data that is not typical of the data, although they are not necessarily anomalies. This is especially true for irregular datasets (also known as outliers).

Anomalies are not limited to natural observations. In fact, most data in the business, social, mathematical, or scientific fields sometimes has unusual values or distributions. To aid decision making in these situations, patterns can be identified relating to different data values, relationships, proportions, or differences from a normal distribution. These patterns or anomalies are deviations of some theoretical significance. However, the deviation value is usually so small that most people don't notice it. It can be called an outlier, anomaly, or difference, with any of these terms referring to both the observed data and the possible underlying probability distribution that generates the data.

Assessing the Data Anomaly Problem

Now that we know a little about data anomalies, let's look at how to interpret the data and assess the possibility of an anomaly. It is useful to consider anomalies on the assumption that data is generated by relatively simple and predictable processes. Therefore, if the data were generated by a specific process with a known probability distribution, then we could confidently identify the anomaly and observe the deviation of the data.

It is unlikely that all anomalies are associated with a probability distribution. However, if there are any anomalies associated with the probability distribution, then this would be evidence that the data is indeed generated by a process or processes that are likely to be predictable.

In these circumstances, the anomaly is indicative of the likelihood of data processing. It is unlikely that a pattern of deviations or outliers in the data is a random deviation of the underlying probability distribution. This suggests that the deviation is associated with a specific, non-random process. Under this assumption, anomalies can be thought of as anomalies in the data generated by the process. However, the anomaly is not necessarily related to the data processing process.

Understanding Data Anomalies

In the context of evaluating data anomalies, it is important to understand the probability distribution and its probability. It is also important to know whether the probability is approximately distributed or not. If it is approximately distributed, then the probability is likely to be approximately equal to the true probability. If it is not approximately distributed, then there is a possibility that the probability of the deviation may be slightly greater than the true probability. This allows anomalies with larger deviations to be interpreted as larger anomalies. The probability of a data anomaly can be assessed using any measure of probability, such as sample probability, likelihood, or confidence intervals. Even if the anomaly is not associated with a specific process, it is still possible to estimate the probability of a deviation.
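
One way to put a number on "the probability of a deviation" is to fit a distribution to the data and compute a tail probability for the suspicious observation. The sketch below assumes a normal distribution purely for illustration; the data and the observation being assessed are hypothetical.

```python
# A hedged sketch (illustrative only): estimating how improbable an observed
# deviation is under a normal distribution fitted to hypothetical data.
import numpy as np
from scipy.stats import norm

data = np.array([10.2, 9.8, 10.0, 10.1, 9.9, 10.3, 9.7, 10.0, 10.2, 9.8])
observation = 10.5

mu, sigma = data.mean(), data.std(ddof=1)
# Two-sided tail probability of seeing a value at least this far from the mean.
tail_prob = 2 * norm.sf(abs(observation - mu) / sigma)
print(f"estimated probability of a deviation this large: {tail_prob:.4f}")
```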

These probabilities must be compared with the natural distribution. If the probability is much greater than the natural probability, then there is a possibility that the deviation is not of the same magnitude. However, it is unlikely that the deviation is much greater than the natural probability, since the probability is very small. Therefore, this does not indicate an actual deviation from the probability distribution.

Revealing the Significance of Data Anomalies

In the context of evaluating data anomalies, it is useful to identify the relevant circumstances. For example, if there is an anomaly in the number of delayed flights, it may happen that the deviation is quite small. If many flights are delayed, it is more likely that the number of delays is very close to the natural probability. If there are several delayed flights, it is unlikely that the deviation is much greater than the natural probability. Therefore, this will not indicate a significantly higher deviation. This suggests that the data anomaly is not a big deal.

If the percentage deviation from the normal distribution is significantly higher, then there is a possibility that data anomalies are process related, as is the case with this anomaly. This is additional evidence that the data anomaly is a deviation from a normal distribution.

After analyzing the significance of the anomaly, it is important to find out what the cause of the anomaly is. Is it related to the process that generated the data, or is it unrelated? Did the data anomaly arise in response to an external influence, or did it originate internally? This information is useful in determining what the prospects for obtaining more information about the process are.

The reason is that not all deviations are related to process variability and affect the process in different ways. In the absence of a clear process, determining the impact of a data anomaly can be challenging.

Analysis of the Importance of Data Anomalies

In the absence of evidence of deviation from the probability distribution, data anomalies are often ignored. This makes it possible to identify data anomalies that are of great importance. In such a situation, it is useful to calculate the probability of deviation. If the probability is small enough, then the anomaly can be neglected. If the probability is much higher than the natural probability, then it may provide enough information to conclude that the process is large and the potential impact of the anomaly is significant. The most reasonable assumption is that data anomalies occur frequently.

Conclusion

In the context of assessing data accuracy, it is important to identify and analyze the amount of data anomalies. When the number of data anomalies is relatively small, it is unlikely that the deviation is significant, and the impact of the anomaly is small. In this situation, data anomalies can be ignored, but when the number of data anomalies is high, it is likely that the data anomalies are associated with a process that can be understood and evaluated. In this case, the problem is how to evaluate the impact of the data anomaly on the process. The quality of the data, the frequency of the data, and the speed at which the data is generated are factors that determine how to assess the impact of an anomaly.

Analyzing data anomalies is critical to learning about processes and improving their performance. It provides information about the nature of the process. This information can be used in evaluating the impact of the deviation and in evaluating the risks and benefits of applying process adjustments. After all, data anomalies are important because they give insight into processes.

The ongoing process of evaluating the impact of data anomalies provides valuable insights. It gives decision makers information about the process that can be used to improve its effectiveness.

This approach makes it possible to detect anomalies in the data and thus to evaluate their impact. The goal is to gain insight into processes and improve their performance. In such a scenario, the approach gives a clear idea of the type of process change that can be made and the impact of the deviation. This information can be used to identify process anomalies and to assess the effect of the deviation. Identifying process anomalies is very important, because it provides valuable data for assessing potential anomalies in process performance.

Anomaly analysis is a process that estimates the frequency of outliers in the data and compares it to the background frequency. The criterion for evaluating the frequency of data deviation is the greater number of data deviations, and not the natural occurrence of data anomalies. In this case, the frequency is measured by comparing the number of data deviations with the background occurrence of data deviations.
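
The comparison of an observed outlier frequency against a background frequency can be sketched as follows. This is only an illustration: the baseline data, the injected deviations, and the 3-standard-deviation cutoff are hypothetical.

```python
# A rough sketch (hypothetical data and thresholds): comparing the rate of
# outliers in a recent window against a background outlier rate.
import numpy as np

def outlier_rate(values, mean, std, threshold=3.0):
    """Fraction of values more than `threshold` standard deviations from the mean."""
    values = np.asarray(values)
    return np.mean(np.abs(values - mean) / std > threshold)

baseline = np.random.default_rng(0).normal(loc=10.0, scale=1.0, size=1000)
recent = np.concatenate([np.random.default_rng(1).normal(10.0, 1.0, 95),
                         np.full(5, 18.0)])          # 5 injected deviations

mu, sigma = baseline.mean(), baseline.std(ddof=1)
print("background rate:", outlier_rate(baseline, mu, sigma))
print("recent rate:    ", outlier_rate(recent, mu, sigma))
```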

This provides information on how much data deviation is caused by the process over time and the frequency of deviation. It can also provide a link to the main rejection process. This information can be used to understand the root cause of the deviation. A higher data rejection rate provides valuable insight into the rejection process. In such a situation, the risk of deviation is likely to be detected, and necessary process changes can be assessed.

Many studies are conducted on the analysis of data anomalies to identify factors that contribute to the occurrence of data anomalies. Some of these factors relate to processes that require frequent process changes. Some of these factors can be used to identify processes that may be abnormal. Many parameters can be found in systems providing process performance.




Association Rule Learning


Association rule learning is a rule-based machine learning technique for discovering interesting relationships between variables in large sample databases. This technique is inspired by the auditory system, where we learn the association rules of an auditory stimulus and that stimulus alone.
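
Association rules are usually scored by measures such as support, confidence, and lift. The short sketch below computes them for one candidate rule over a handful of hypothetical transactions; the items and the rule itself are made up for illustration.

```python
# A minimal sketch (hypothetical transactions): computing support, confidence,
# and lift for a candidate rule {bread} -> {butter}.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"butter", "milk"},
    {"bread", "butter", "jam"},
]

def support(itemset):
    """Fraction of transactions that contain every item in the itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

antecedent, consequent = {"bread"}, {"butter"}
rule_support = support(antecedent | consequent)
confidence = rule_support / support(antecedent)
lift = confidence / support(consequent)
print(f"support={rule_support:.2f}, confidence={confidence:.2f}, lift={lift:.2f}")
```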

Sometimes when working with a dataset, we are not sure if the rows in the dataset are relevant to the training task, and if so, which ones. We may want to skip those rows in the dataset that don't matter. Therefore, associations are usually determined by non-intuitive criteria, such as the order in which these variables appear in a sequence of examples, or duplicate values in these data rows.

This problematic aspect of learning association rules can be addressed in the form of an anomaly detection algorithm. These algorithms attempt to detect non-standard patterns in large datasets that may represent unusual relationships between data features. These anomalies are often detected by pattern recognition algorithms, which are also part of statistical inference algorithms. For example, the study of naive Bayes rules can detect anomalies in the study of association rules based on a visual inspection of the presented examples.

In a large dataset, a feature space can represent an area of an image as a set of numbers, in which each image pixel has a certain number of pixels. The characteristics of an image can be represented as a vector, and we can place this vector in the feature space. If the attribute space is not empty, the attribute will be the number of pixels in the image that belong to a particular color.




Clustering


Clustering is the task of discovering groups and structures in data that are similar to some extent, not by using known structures in the data, but by learning from what is already there.

In particular, clustering is used in such a way that new data points are only added to existing clusters, without changing their shape to fit the new data. In other words, clusters are formed before data is collected, rather than fixed after all data is collected.

Given a set of parameters for data that is (mostly) variable, and their collinearity, clustering can be thought of as a hierarchical algorithm for finding clusters of data points that satisfy a set of criteria. Parameters can be grouped into one of two categories: parameter values that define the spatial arrangement of clusters, and parameter values that define relationships between clusters.

Given a set of parameters for a dataset, clustering can be thought of as discovering those clusters. What parameters do we use for this? The implicit clustering method, which finds the nearest clusters (or, in some versions, clusters more similar to each other) with the least computational cost, is probably the simplest and most commonly used method for doing this. In clustering, we aim to keep the clusters as closely related to each other as possible, whether we do this by taking more measurements or by using only a certain technique to collect data.
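
As one concrete, commonly used example of assigning points to the nearest cluster, the sketch below runs k-means from scikit-learn. The toy 2-D points and the choice of two clusters are hypothetical, and k-means is offered only as an illustration of the idea, not as the method the authors have in mind. Note how new points are simply assigned to an existing cluster, as described earlier.

```python
# A brief sketch (illustrative, hypothetical points): nearest-cluster
# assignment with k-means.
import numpy as np
from sklearn.cluster import KMeans

points = np.array([[1.0, 1.1], [0.9, 1.0], [1.2, 0.8],
                   [8.0, 8.2], [7.8, 8.1], [8.3, 7.9]])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(points)
print("labels:", kmeans.labels_)

# New data points are assigned to the nearest existing cluster,
# without changing the cluster centres.
print("new point cluster:", kmeans.predict(np.array([[1.1, 0.9]])))
```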

But what is the difference between clustering and splitting data into one or more datasets?

The methods of implicit clustering and managed clustering are actually very similar. The only difference is that we use different parameters to determine in which direction we should split the data. Take as an example a set of points on a sphere that define an interconnected network. Both methods aim to keep the network as close as possible to the network defined by the two nearest points. This is because we don't care if we are very far from one or the other. So, using the implicit clustering algorithm (cluster distance), we will divide the sphere into two parts that define very different networks: one will be the network defined by the two closest points, and the other will be the network defined by the two farthest points. The result is two completely separate networks. But this is not a good approach, because the further we move away from the two closest points, the smaller the distance between the points, and the more difficult it will be to find connections between them, since there is a limited number of points that are connected by a small distance.

On the other hand, the method of controlled clustering (cluster distance) would require us to measure the length between each pair of points, and then perform calculations that make the networks closest to each other the smallest distance possible. The result is likely to be two separate networks that are close to each other but not exactly the same. Since we need two networks to be similar to each other in order to detect a relationship, it is likely that this method will not work; instead, the two clusters will be completely different.

The difference between these two methods comes down to how we define a cluster. The point is that in the first method (cluster distance) we define a cluster as a set of points belonging to a network similar to a network defined by two nearest points. By this definition, networks will always be connected (they will be the same distance apart) no matter how many points we include in the definition. But in the second method (clustering control), we define clusters as pairs of points that are the same distance from all other points in the network. This definition can make finding connected points very difficult because it requires us to find every point that is similar to other points in the network. However, this is an understandable compromise. By focusing on finding clusters with the same distance from each other, we are likely to get more useful data, because if we find connections between them, we can use this information to find the relationship between them. This means that we have more opportunities to find connections, which will make it easier to identify relationships. By defining clusters using distance measurements, we ensure that we can find a relationship between two points, even if there is no way to directly measure the distance between them. But this often results in very few connections in the data.
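
The contrast between defining cluster distance by the two nearest points and defining it by points that keep a common distance to everything else is loosely mirrored by single-linkage versus complete-linkage hierarchical clustering. The sketch below is an illustration under that reading, with hypothetical toy points; it is not an implementation of the book's own procedure.

```python
# A compact sketch (illustrative mapping): "single" linkage merges clusters by
# their two nearest points, "complete" linkage by their two farthest points.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

points = np.array([[0.0, 0.0], [0.5, 0.2], [1.0, 0.1],
                   [5.0, 5.0], [5.4, 4.8], [6.0, 5.2]])

for method in ("single", "complete"):
    tree = linkage(points, method=method)
    labels = fcluster(tree, t=2, criterion="maxclust")   # cut into 2 clusters
    print(method, "->", labels)
```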

Looking at the example of creating two datasets, one for implicit clustering and one for managed clustering, we can easily see the difference between the two methods. In the first example, the results may be the same in one case and different in another. But if the method is good for finding interesting relationships (as it usually is), it will give us useful information about the overall structure of the data. However, if the technique is not good at identifying relationships, then it will give us very little information.

Let's say we are developing a system for determining the direction of a new product and want to identify similar products. Since it is not possible to measure the direction of a product outside the system, we will have to find relationships between products based on information about their names. If there is a good rule that we can use to establish relationships between similar products, then this information is very useful, as it allows us to find interesting relationships (by identifying similar products that appear close to each other). However, if the relationship between two products isn't very obvious, it's likely that it's just an unrelated relationship, which means the feature detection method we choose may not matter much. On the other hand, if the relationship is not very obvious but extremely useful (as in the example above), then we can start to learn how the product name is related to the process the product went through. This is an example of how different methods can produce very different results.

Unlike the characteristics of different methods, you also have different possible techniques. For example, when I say that my system uses image recognition, it doesn't necessarily mean that the process the product goes through uses image recognition. If there are product images that we have taken in the past, or if we have captured some input from a product image, the resulting system will probably not use image recognition. It could be something completely different, something much more complex. Each of these methods is capable of identifying very different things. The result may depend on the characteristics of the actual data or on the data used. This means it's not enough to look at a specific type of tool; we also need to look at what type of tool will be used for a particular type of process. This is an example of how data analysis should not be focused only on the problem being solved. Most likely, the system goes through many different processes, so we need to look at how different tools will be used to create a relationship between two points, and then decide which type of data to consider.

Often, we will be more concerned with how the method will be applied. For example, we might want to see what type of data is most likely to be useful for finding a relationship. We see that there is not much difference in how natural language processing is applied. This means that if we want to find a relationship, natural language processing is a good choice. However, natural language processing does not solve every possible relationship. Natural language processing is often useful when we want to take a huge number of small steps, but it does nothing when we want to go really deep. A look at natural language processing allows you to establish relationships between data that cannot be done using other methods. This is one of the reasons why natural language processing can be useful but not necessary.

However, natural language processing often doesn't find as strong connections as image recognition, because natural language processing focuses on simpler data whereas image recognition looks at very complex data. In this case, natural language processing is not very good, but it can still be useful. Considering natural language processing is not always the best way to solve a problem. Natural language processing can be useful if the data is simple, but sometimes it is not possible to work with very complex data.

This example can be applied to many different types of data, but natural language processing is generally more useful for natural language data such as text files. For more complex data (such as images), natural language processing is often not enough. If there is a problem with natural language processing, it is important to consider other methods, such as detecting words and determining what data is actually stored in an image. This data type will require a different data structure to find the relationship.

With the increasing complexity of technology, we often don't have time to look at the data we're looking at. Even if we look at the data, we may not find a good solution, because we have a large number of options, but not much time to consider them all. This is why many companies have a data scientist who can make many different decisions and then decide what works best for the data.




Classification


Classification is the task of generalizing a known structure to be applied to new data. For example, an email program might try to classify an email as legitimate, as spam, or as something to be deleted by the administrator, and if it does this correctly, it can mark the email as relevant to the user.
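
A standard way to build such an email classifier, offered here only as an illustration rather than as the book's own method, is a bag-of-words representation combined with a naive Bayes model; the tiny training set below is hypothetical.

```python
# A short sketch (hypothetical emails): classifying short texts as spam or
# legitimate with a bag-of-words naive Bayes model.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

emails = ["win a free prize now", "cheap meds online", "meeting agenda attached",
          "lunch tomorrow?", "free offer just for you", "project status report"]
labels = ["spam", "spam", "legit", "legit", "spam", "legit"]

vectorizer = CountVectorizer()
features = vectorizer.fit_transform(emails)      # bag-of-words counts
model = MultinomialNB().fit(features, labels)

new_email = ["claim your free prize"]
print(model.predict(vectorizer.transform(new_email)))  # expected: ['spam']
```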

However, for servers, the classification is more complex because storage and transmission are far away from users. When servers consume huge amounts of data, the problem is different. The job of the server is to create a store and pass that store around so that servers can access it. Thus, servers can often avoid disclosing particularly sensitive data if they can understand the meaning of the data as it arrives, unlike the vast pools of data often used for email. The problem of classification is different and needs to be approached differently, and current classification systems for servers do not provide an intuitive mechanism for users to have confidence that servers are classifying their data correctly.

This simple algorithm is useful for classifying data in databases containing millions or billions of records. The algorithm works well, provided that all relationships in the data are sufficiently different from each other and that the data is relatively small in both columns and rows. This makes data classification useful in systems with relatively little memory and little computation, and therefore the classification of large datasets remains a major unsolved problem.

The simplest classification algorithm for classifying data is the total correlation method, also known as the correlation method. In full correlation, you have two sets of data, and you are comparing data from one set to data from another set. This is easy to do for individual pieces of data. The next step is to calculate the correlation between the two datasets. This correlation of two sets of data tells you what percentage of the data is in each set. Thus, using this correlation, you can classify data as either one set or the other, indicating the parts of the data set that come from one set or the other.
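
One possible reading of this correlation-based classification, sketched below with hypothetical numbers, is to assign a new record to whichever dataset's average profile it correlates with more strongly. This is an illustrative interpretation, not a standard named algorithm.

```python
# A rough sketch (hypothetical data): assign a new record to the dataset whose
# average profile it correlates with more strongly.
import numpy as np

set_a = np.array([[1.0, 2.0, 3.0], [1.1, 2.1, 2.9], [0.9, 1.9, 3.1]])
set_b = np.array([[3.0, 2.0, 1.0], [3.2, 1.9, 1.1], [2.9, 2.1, 0.9]])
new_record = np.array([1.05, 2.0, 3.0])

corr_a = np.corrcoef(new_record, set_a.mean(axis=0))[0, 1]
corr_b = np.corrcoef(new_record, set_b.mean(axis=0))[0, 1]
print("assigned to:", "set A" if corr_a > corr_b else "set B")
```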

This simple method often works well for data stored in simple databases with a small amount of data and slow data access speeds. For example, a database system may use a tree structure to store data, with the columns of a record representing fields in the structure. This structure does not allow data to be ranked, because the data would be in two separate rows of the tree structure. This makes it impossible to make sense of the data if the data fits in only one tree structure. If the database has two data trees, you will need to compare each of the two trees. If there were a large number of trees, the comparison could be computationally expensive.

Therefore, full correlation is a poor classification method. Data correlation does not distinguish between relevant parts of the data, and the data is relatively small in both columns and rows. These problems make full correlation unsuitable for simple data classification systems and data storage systems. However, if the data is relatively large, full correlation can be applied. This example is useful for storage systems with a relatively high computational load.



Combining a data classification method with a data storage system improves both performance and usability. In particular, the size of the resulting classification algorithm is largely independent of the size of the data store. The detailed classification algorithm does not require a lot of memory to store data at all. It is often small enough to be buffered, and many organizations store their classification systems this way. Also, the performance characteristics of the storage system do not depend on the classifier. The storage system can handle data with a high degree of variability.

Why are classification systems not so good?

Most storage systems do not have a good classifier, and the data classification system is unlikely to get better over time. If your storage system does not have a good classifier, your classification system will have problems.

Most companies don't think this way about their storage systems. Instead, they assume that the system can be fixed. They see it as something that can be improved over time based on future maintenance efforts. This belief also makes it easy to fix some of the problems that come from bad storage systems. For example, a storage system that doesn't accept overly short or jumbled data can be improved over time if more people are involved in fixing it.



